[HN Gopher] Show HN: A pure C89 implementation of Go channels, w...
       ___________________________________________________________________
        
       Show HN: A pure C89 implementation of Go channels, with blocking
       selects
        
       Author : Rochus
       Score  : 114 points
       Date   : 2023-12-13 19:31 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | sim7c00 wrote:
       | not being a god level programmer i wont go into quality of code,
       | but this looks to me really neat and easy to use. well done!
        
       | eqvinox wrote:
       | It's 2023 - even MSVC has supported C11 and C17 for a while now.
       | C89 is no longer a feature to advertise, it's an unhelpful
       | constraint forcing poorer quality code. A good example is that
       | CspChan_closed should use the bool type added in C99.
       | 
       | There's nonstandard nomenclature in "_dispose"; if the
       | constructing function is called _create, the common pattern is to
       | pair it with _destroy.
       | 
       | typedefs ending with _t are reserved for standards extensions,
       | though these days people ignore this a lot because it feels like
       | general practice to add _t.
       | 
       | There's no way to pass in a custom allocator, or logging
       | callback, or set some debug logging flags. Maybe something to
       | tackle later.
       | 
       | Both _select functions fall short in that they do not allow
       | select'ing on channels + other file descriptors simultaneously.
       | General library design practice is to have the library return a
       | file descriptor that can be used in whatever event loop the
       | application already uses. (The fd can be a dummy pipe or
       | eventfd.)
       | 
       | All of these things are in the header, so they matter the most as
       | they set ABI and API for any user. Unfortunately getting them
       | right as early as possible is important to avoid breaks; yet very
       | hard to do since often API aspects only become clear after a
       | library has non-trivial users.
       | 
       | [edited to tone down a bit]
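The file-descriptor suggestion above can be sketched with a self-pipe. This is only an illustration under stated assumptions: `Notifier` and the `notifier_*` functions are hypothetical names, not part of CspChan, and a Linux-only build could use eventfd in place of the pipe. The library writes one byte when a channel becomes ready, and the application adds the read end to whatever poll/select/epoll loop it already runs.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not part of CspChan: a self-pipe whose read end
   becomes readable whenever the library has something to report. */
typedef struct {
    int rfd; /* read end: handed to the application's event loop */
    int wfd; /* write end: kept by the library */
} Notifier;

/* Returns 0 on success, -1 on error. */
int notifier_init(Notifier *n)
{
    int fds[2];
    assert(n != NULL);
    if (pipe(fds) != 0)
        return -1;
    /* Non-blocking, so signalling and draining never stall a thread. */
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* Called by the library when a channel becomes ready. */
void notifier_signal(const Notifier *n)
{
    char b = 1;
    /* If the pipe is full the fd is already readable, so a failed
       write loses nothing. */
    ssize_t r = write(n->wfd, &b, 1);
    (void)r;
}

/* Called by the application after its event loop reports rfd readable. */
void notifier_drain(const Notifier *n)
{
    char buf[64];
    while (read(n->rfd, buf, sizeof buf) > 0)
        ;
}

/* Thin wrapper over poll(): positive if readable, 0 on timeout, -1 on error. */
int notifier_poll(const Notifier *n, int timeout_ms)
{
    struct pollfd p;
    p.fd = n->rfd;
    p.events = POLLIN;
    p.revents = 0;
    return poll(&p, 1, timeout_ms);
}
```

In a real event loop the application would not call `notifier_poll` at all; it would register `rfd` next to its sockets and call `notifier_drain` before asking the library which channels are ready.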
        
         | coumbaya wrote:
         | Not dismissing anything you said but bool isn't C89 right ?
         | Back ~6 years ago when I worked in embedded C89, bool was just
         | a #define true (1==1), #define false (0==1) so I guess it makes
         | sense it isn't in the lib ?
        
           | eqvinox wrote:
           | Indeed, bool (or _Bool) was added in C99, hence my pointing
           | out it not being used... it's not in C89.
           | 
           | It matters because the return value sense is different; an
           | "int" return for a status-ish thing in a modern library
           | generally means 0 for success, nonzero for error codes.
           | "bool" is the other way around, 0 for failure-y. In this case
           | it's a status retrieval function so it's pretty clear that
           | it's intended as a boolean but it'd still be better to
           | actually make it bool.
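A minimal sketch of the two return-value conventions described above, assuming C99 or later for `<stdbool.h>`. The `Chan` type and both functions are hypothetical stand-ins and do not reflect CspChan's real struct or signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in type; not CspChan's real struct. */
typedef struct {
    bool closed;
} Chan;

/* Predicate: the bool return type tells the reader that true means
   "yes, it is closed". */
bool chan_is_closed(const Chan *c)
{
    assert(c != NULL);
    return c->closed;
}

/* Status code: 0 means success, nonzero identifies the failure. */
int chan_close(Chan *c)
{
    if (c == NULL)
        return -1;   /* invalid argument */
    if (c->closed)
        return 1;    /* already closed */
    c->closed = true;
    return 0;
}
```

With these signatures, `if (chan_is_closed(c))` and `if (chan_close(c) != 0)` both read the way the types suggest, whereas an `int`-returning predicate leaves the caller guessing which convention applies.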
        
             | coumbaya wrote:
             | Ah, I get it now, thx.
        
         | dang wrote:
         | I realize your intention is to provide good feedback and you
         | obviously know a lot, but if you could shift the pH slightly
         | away from acidic, that would be better. Keep in mind that
         | criticisms like this have a tendency to land 10x harder than
         | you intended.
         | 
         | https://news.ycombinator.com/showhn.html
        
           | eqvinox wrote:
           | Sigh. Yes. Valid. I'll go edit it a bit.
        
             | dang wrote:
             | Appreciated!
        
           | drewg123 wrote:
           | This has to be the best comment from a moderator that I've
           | ever seen. This is why I love HN.
        
             | lnxg33k1 wrote:
             | I almost never agree with mods here, but despite that I've
             | got to admit that the way they expose their opinions is one
             | of the best I've experienced
        
         | oconnor663 wrote:
         | > that's just the header
         | 
         | You're providing a ton of code style feedback. That's fair,
         | code style is important, I'm not saying you shouldn't. But
         | surely when providing style feedback _in public_ you could take
         | a friendlier tone :(
        
           | wiseowise wrote:
           | What's unfriendly in their tone?
        
         | kazinator wrote:
         | > should really be a bool
         | 
         | bool is actually #define bool _Bool. You have to include
         | <stdbool.h> to get it. That's ugly enough not to want to use
         | it.
         | 
         | I'm looking at the April 2023 draft of ISO C. Under "Relational
         | Expressions" I see that the result of an expression like x < y
         | isn't bool, but ... int.
         | 
         | Here is the exact wording, minus a footnote reference:
         | 
         |  _Each of the operators < (less than), > (greater than), <=
         | (less than or equal to), and >= (greater than or equal to)
         | shall yield 1 if the specified relation is true and 0 if it is
         | false. The result has type int._
         | 
         | int is still the Boolean type in C, and null pointers and zeros
         | are still falsy, relational expressions yield 0 or 1 of type
         | int, and #define bool is just a sham for anal retentives.
         | 
         | > _typedefs ending with _t are reserved for standards
         | extensions_
         | 
         | That is inaccurate. It's POSIX that reserves this namespace.
         | Even if you're targeting POSIX, the reservation has next to no
         | practical meaning, and can safely be ignored.
         | 
         | Here is why. What POSIX says is that when new type names will
         | be introduced in the future in POSIX, they will have _t as a
         | suffix. Well, so what? New type names in ISO C will also have
         | _t suffixes, yet ISO C doesn't say anything about that being
         | "reserved".
         | 
         | When a new identifier is introduced, it has to start and end in
         | _something_.
         | 
         | Whenever POSIX introduces a new public identifier in the
         | future, that identifier will either start with "a", or else
         | with "b", or with "c" ... does that mean that we should stop
         | using all identifiers in order not to tread on a reserved
         | space?
         | 
         | When you have a language without namespaces/packages, you just
         | live with the threat of a clash and deal with it when it
         | happens.
         | 
         | Name clashes are not just with standards like ISO C and POSIX
         | but vendor extensions and third-party code.
         | 
         | The mole is only a problem when it rears its head, and that's
         | when you whack it.
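The quoted wording on relational operators can be checked directly. The sketch below assumes a C11 compiler for `_Generic`; the macro and function names are made up for illustration. It shows only the language facts: `0 < 1` has type `int`, while a value converted to `bool` has a distinct type.

```c
#include <assert.h>
#include <stdbool.h>

/* 1 if the expression has type int, 0 otherwise (C11 _Generic). */
#define IS_INT(x) _Generic((x), int: 1, default: 0)

/* The result of a relational operator has type int. */
int comparison_is_int(void)
{
    return IS_INT(0 < 1);
}

/* A value converted to bool has type _Bool, which is not int. */
int bool_value_is_int(void)
{
    return IS_INT((bool)1);
}

void check_relational_type(void)
{
    assert(comparison_is_int());
    assert(!bool_value_is_int());
    assert(sizeof(0 < 1) == sizeof(int));
}
```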
        
           | eqvinox wrote:
           | > bool is actually #define bool _Bool. You have to include
           | <stdbool.h> to get it. That's ugly enough not to want to use
           | it.
           | 
           | Sure, which is why it got changed to a keyword in C23. I'll
           | agree this should have happened earlier.
           | 
           | The return type of comparison operators is entirely
           | irrelevant; you don't build APIs by reference to the return
           | type of a comparison operator. It makes a difference to the
           | reader whether your source code says "int" or "bool", that
           | alone is all the reason anyone should need.
           | 
           | > That is inaccurate. It's POSIX that reserves this
           | namespace.
           | 
           | You got me there. However, the question is, why would you add
           | the _t? It serves no purpose. Typedefs have their own
           | namespace anyway, you're not working around some possible
           | collision by adding the _t.
           | 
           | That said I've already noted this is a commonly ignored
           | aspect. I suppose it "feels" better/correct to some readers.
           | I will happily agree this is the weakest point of my feedback
           | either way, yet I still rather point it out and have people
           | learn more details about this.
        
             | kazinator wrote:
             | > _The return type of comparison operators is entirely
             | irrelevant_
             | 
             | It is entirely relevant. The boolean type of the language
             | is whatever is the type of (0 < 1).
             | 
             | > _Typedefs have their own namespace anyway_
             | 
             | Typedefs positively do not have their own namespace.
             | 
             | If you write
             | 
             |     #undef getc
             |     {
             |       typedef char getc;
             |     }
             | 
             | you cannot call the getc function in that scope; it is now
             | the typedef.
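The namespace point can be reproduced without touching `getc`. In this sketch the identifier `value` is an arbitrary example name: a struct tag `value` coexists with a function `value` because tags have their own namespace, but a typedef `value` in an inner block hides the function because typedef names are ordinary identifiers.

```c
#include <assert.h>
#include <stddef.h>

int value(void)
{
    return 42;
}

/* Tags live in their own namespace, so this coexists with the function. */
struct value {
    int n;
};

size_t size_in_inner_scope(void)
{
    struct value v;
    v.n = value();            /* tag and function do not collide */
    assert(v.n == 42);
    {
        typedef char value;   /* hides the function inside this block */
        value c = 'x';
        /* A call such as value() would not compile here, because
           value now names a type. */
        (void)c;
        return sizeof(value); /* sizeof(char), which is always 1 */
    }
}
```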
        
         | sylware wrote:
         | Even C has already a syntax way too rich and complex (and C11
         | and C17 tantrums are making things worse).
         | 
         | Integer promotion should go, like implicit cast for anything
         | except void* and literals, but we are missing a dynamic/static
         | casts syntax explicit split. Only one loop keyword, loop {}, no
         | switch, no anonymous block, no "a?b:c" operator. typedef has to
         | go, (and typeof,generic,etc), the variable arguments of
         | preprocessor function should be defined once and for all.
         | __thread must go as tls should be managed dynamically and
         | explicitly with the system interface, never statically and
         | hidden by the runtime. Only sized primitive types (u8/s8...).
         | That said, anonymous union/struct are very nice for complex
         | memory layout.
         | 
         | And all the things I am forgetting right now.
         | 
         | With the pre-processor and coding discipline you can get close
         | to that already, but I was told that what I am describing is
         | basically the simplicity of rust syntax, true? Namely it is
         | easier to write a naive rust compiler than a C compiler?
        
           | IlliOnato wrote:
           | What's wrong with ternary operator ("a?b:c") ? It seems many
           | people hate it but I've never seen a reason for it.
           | 
           | This operator is very useful and uncontroversial in Perl,
           | it's normal (and quite readable) in selecting a value for
           | assignment. One is expected to use good practices with it,
           | but it is nice.
           | 
           | Is the problem the fact that in C you can use it instead of
           | if, selecting not values but actions, like (a > b) ?
           | printf("A") : printf("B");
        
             | zogrodea wrote:
             | I (a different person replying to your post) don't mind
             | ternary operators, but I do prefer if-expressions which
             | convey the same thing semantically but can be nicer to read
             | (and also way better when chaining else-if ladders - nested
             | ternaries can be a nightmare to read).
             | https://stackoverflow.com/a/46843369
        
             | dgfitz wrote:
             | I don't like them because they're easy to write and a pain
             | to deal with later. My biggest irritation comes when I'd
             | like to break inside the if or else clause of a ternary and
             | can't. There is also more mental overhead to parse and keep
             | in the mental stack.
             | 
             | I have also seen them used cleanly and beautifully, but
             | that is a rare occurrence.
        
         | MrRadar wrote:
         | Also, as someone who wrote code that had to support C89-only
         | compilers as recently as 5 years ago, I noticed immediately
         | that this code mixes code and variable declarations which
         | strictly C89 compatible compilers will barf on. That said, this
         | code also uses pthreads which is not exactly as portable as
         | "pure C89" implies either.
        
         | tomcam wrote:
         | > It's 2023 - even MSVC has supported C11 and C17 for a while
         | now. C89 is no longer a feature to advertise
         | 
         | There are plenty of constrained environments where this is
         | indeed a feature to advertise, namely older machines and
         | embedded systems.
        
           | 38 wrote:
           | > older machines
           | 
           | Anything from the 80s/90s isn't really worth supporting. We
           | gotta draw the line somewhere.
        
           | fweimer wrote:
           | But it's not actually written in C89. It uses POSIX threads,
           | flexible array members, anonymous unions, bitfield members
           | that are not int, declarations intermixed with statements.
           | And that's just what "gcc -std=c89 -pedantic-errors" reports.
        
         | pengaru wrote:
         | The standards extensions reservation of _t is such a non-issue
         | I don't understand why people even bother mentioning it
         | anymore.
         | 
         | It's become such a common practice to suffix your typedefs with
         | _t it's practically a de facto standard at this point.
         | 
         | Since everyone's already namespacing types with some kind of
         | prefix, I don't see the problem nor have I ever experienced a
         | negative consequence after decades of writing C this way.
        
         | neverartful wrote:
         | No good deed goes unpunished.
        
         | Galanwe wrote:
         | > It's 2023 - even MSVC has supported C11 and C17 for a while
         | now. C89 is no longer a feature to advertise, it's an unhelpful
         | constraint forcing poorer quality code.
         | 
         | Oh come on, let's not play the language police... You are
         | entitled to your language likings, and so are others. C89 is
         | simple, straightforward, no bells and whistles, and some people
         | like that (I do).
         | 
         | > There's nonstandard nomenclature in "_dispose"; if the
         | constructing function is called _create, the common pattern is
         | to pair it with _destroy.
         | 
         | "Nonstandard" says who?
         | 
         | In my 20 years of C I've seen all possible combinations of
         | _init/_create/_new/_alloc
         | _free/_deinit/_release/_delete/_destroy/_dispose. As long as
         | it's consistent across the codebase, it's fine.
         | 
         | > typedefs ending with _t are reserved for standards
         | extensions, though these days people ignore this a lot because
         | it feels like general practice to add _t.
         | 
         | Not really. The usual convention for _t is to differentiate
         | between raw struct names and typedef structs, such that you
         | know whether you have to prefix "struct" during declaration.
         | i.e. `struct foo {}`/`typedef struct {} foo_t`
         | 
         | > Both _select functions fall short in that they do not allow
         | select'ing on channels + other file descriptors simultaneously.
         | 
         | Good point
        
         | dekhn wrote:
         | I feel like this comment could be best structured in the form
         | of a series of commits sent as a combined Merge Request.
         | 
         | First, a commit to change the docs to say it's a pure C99 (or
         | C11 or C17) implementation.
         | 
         | Second, for each of your points, a single commit fixing each
         | style issue (_dispose -> _destroy, typedefs).
         | 
         | Separately, another MR with the commit for _select
         | improvements.
         | 
         | Or maybe just send the C99 one first, and if that gets
         | rejected, don't bother with the rest.
        
         | jansommer wrote:
         | MSVC lacks some C99 features like variable length arrays and
         | complex numbers. There's always a work around, but wouldn't
         | want to use that compiler for C unless I had to.
         | 
         | Going for C89 for a truly portable project is probably fine.
        
       | vore wrote:
       | while (!CspChan_closed(chan)) { ... } seems like a concurrency
       | footgun - what's stopping the channel from being closed in
       | between the check for it being closed and the operation on the
       | channel?
        
         | withinboredom wrote:
         | And further, what if there are messages still in the closed
         | channel? Do they just go "poof"? It's fine if they do, but that
         | should be documented.
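One common way to avoid the check-then-act race raised above is to let the receive call itself report closure, the way Go's `v, ok := <-ch` does. The sketch below is an assumption-laden illustration, not CspChan's implementation: `IntChan` and the `intchan_*` functions are hypothetical, the channel carries only `int`, and the capacity is fixed. Values queued before the close stay receivable, which mirrors Go's behaviour; the thread does not say what CspChan does.

```c
#include <assert.h>
#include <pthread.h>

#define CAP 4

/* Hypothetical channel of ints; not CspChan's real layout or API. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int buf[CAP];
    int head, count;
    int closed;
} IntChan;

void intchan_init(IntChan *c)
{
    assert(c != NULL);
    pthread_mutex_init(&c->mu, NULL);
    pthread_cond_init(&c->cv, NULL);
    c->head = c->count = 0;
    c->closed = 0;
}

/* Returns 1 if sent, 0 if the channel is closed. Blocks while full. */
int intchan_send(IntChan *c, int v)
{
    int ok = 0;
    pthread_mutex_lock(&c->mu);
    while (c->count == CAP && !c->closed)
        pthread_cond_wait(&c->cv, &c->mu);
    if (!c->closed) {
        c->buf[(c->head + c->count) % CAP] = v;
        c->count++;
        ok = 1;
        pthread_cond_broadcast(&c->cv);
    }
    pthread_mutex_unlock(&c->mu);
    return ok;
}

void intchan_close(IntChan *c)
{
    pthread_mutex_lock(&c->mu);
    c->closed = 1;
    pthread_cond_broadcast(&c->cv);   /* wake every blocked thread */
    pthread_mutex_unlock(&c->mu);
}

/* Returns 1 and stores a value, or 0 once the channel is closed AND empty.
   The closed check and the dequeue happen under one lock, so no other
   thread can close the channel between them. */
int intchan_recv(IntChan *c, int *out)
{
    int ok = 0;
    pthread_mutex_lock(&c->mu);
    while (c->count == 0 && !c->closed)
        pthread_cond_wait(&c->cv, &c->mu);
    if (c->count > 0) {
        *out = c->buf[c->head];
        c->head = (c->head + 1) % CAP;
        c->count--;
        ok = 1;
        pthread_cond_broadcast(&c->cv);
    }
    pthread_mutex_unlock(&c->mu);
    return ok;
}
```

A consumer loop then becomes `while (intchan_recv(&c, &v)) { ... }`, with no separate closed check to go stale.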
        
       | rockwotj wrote:
       | Related is libmill, which has been around for a while and was
       | previously discussed on HN:
       | https://news.ycombinator.com/item?id=30699829
       | 
       | libmill supports a bunch of stuff like sockets, timers, files,
       | etc.
        
       | pjmlp wrote:
       | Basically what ended up replacing Alef in Plan 9.
       | 
       | https://9p.io/magic/man2html/2/thread
        
       | bufo wrote:
       | Great! I was looking into something like this. I assume ending up
       | with epoll will be better?
        
         | eqvinox wrote:
         | Considering that epoll is Linux specific anyway, I would highly
         | advise going straight to io_uring. epoll has a whole bunch of
         | footguns in particular with edge triggered modes of operation;
         | io_uring has a higher initial threshold in understanding how it
         | works but is worth that effort.
         | 
         | (Unless you need to support older Linux kernels that have epoll
         | but no io_uring yet.)
        
           | bufo wrote:
           | Oh yeah I meant io_uring too. Plus Windows copied it so you
           | can implement things very similarly for Windows.
        
       | mmcgaha wrote:
       | When it said pure C89 I figured it was going to have some setjmp
       | and longjmp going on instead of threads.
        
       | samsquire wrote:
       | Wow, thank you for this. Good work!
       | 
       | Some thoughts:
       | 
       | I've often thought that unbuffered channels would cause
       | scheduling thrashing - higher latency and lower throughput
       | because you're swapping between stopping and starting processes
       | blocked on a channel frequently. If you're sending just a small
       | piece of data like a 64-bit integer at a time, or using any
       | pattern where threads break up work into tasks, each task is
       | too small for multithreading to really scale. You want to
       | communicate something that causes a LARGE AMOUNT of work on the
       | other thread, to keep the processor busy. But there's a balance
       | between latency and throughput: if you send a big task you get
       | higher latency to react to the next task but better
       | throughput.
       | 
       | Walking through my thinking and help me understand: If you have
       | multiple channels that you could read from in a select call but
       | none are ready, could you block that select instance and process
       | a different process where other selects are potentially waiting?
       | This is similar to blocking a goroutine or a Rust async Future
       | task in Tokio that needs to be waked. I think this would need a
       | scheduler. EDIT: Your scheduler to switch between "select
       | instances" or what is running in a thread is the OS.
        
       | galkk wrote:
       | Even looking at simple code example, with casting from/to void *,
       | returning 0 as void * etc, I cannot understand how people are
       | calling C easy and productive language
        
         | vngzs wrote:
         | That claim isn't made on the linked site, though. I think
         | people who write C nowadays tend to be working in the drivers
         | and OS development spaces.
        
         | salawat wrote:
         | You're making bits dance through hardware with minimal overhead
         | or layers of abstraction/other programmer's opinions to deal
         | with.
         | 
         | I call that productive.
         | 
         | Further, you have the smallest API/ABI stdlib to work through
         | the quirks of any language. Again. Productive.
         | 
         | I, of course, take the precaution of allocating time to blow up
         | 5/8 of a solar system/a star into my projects. I have a
         | tendency to find ways to do things other than intended, and
         | like to absorb the lessons.
         | 
         | As a result, I have quite the collection of ways not to do
         | things, or how to do things other than I intended. Just because
         | your management fu disagrees with my definition of productive
         | is not my concern.
        
         | zeroCalories wrote:
         | Despite all of the scary casting, there is fairly little you
         | need to know about C to understand what's going on.
        
         | SAI_Peregrinus wrote:
         | C is a _small_ language. It's not simple, nor is it easy.
         | 
         | Brainfuck is a tiny language.
         | 
         | C++ is a gigantic language.
         | 
         | Language size and ease of use are not directly related.
        
       | p_l wrote:
       | fun fact - Go Channels started out as pretty thin syntax sugar
       | over features provided by Plan9's C libthread.
        
       | Zambyte wrote:
       | This is really cool!
       | 
       | I do want to say though, if you're considering using this for a
       | production system: it's worth also considering ZeroMQ inproc
       | sockets[0]. They allow for very similar semantics to this, with
       | the added benefit of trivially being able to migrate to an inter-
       | process / network channel, by just changing the URL to bind /
       | connect to.
       | 
       | [0] http://czmq.zeromq.org/
        
       | Matthias247 wrote:
       | libmill (https://github.com/sustrik/libmill) and libdill
       | (https://github.com/sustrik/libdill) should be similar and
       | probably mentioned.
       | 
       | As far as I understand the differences between CspChan and
       | libmill might be that libmill also implements lightweight tasks
       | (coroutines) and everything that goes with it (IO multiplexing,
       | async timers, etc), while CspChan uses OS threads?
        
       | segmondy wrote:
       | if you don't get anything out of this and don't know Hoare's
       | work, go read CSP - http://www.usingcsp.com/cspbook.pdf
        
       | openasocket wrote:
       | Very interesting! I program a lot in Go, and while I have my
       | complaints about the language, I find the CSP model to be very
       | easy to work with.
       | 
       | I'd recommend adding some additional documentation about the API.
       | You mention trying to keep dynamic allocations to a minimum
       | (which I like to hear) but it would be handy to have
       | documentation stating which functions allocate and how much they
       | allocate. Actually, more documentation in general would be nice,
       | particular relating to edge cases. In Go I know that sending to a
       | closed channel will panic, for example. But what will your
       | library do? Return some sort of error message? Silently drop the
       | message? Segfault? Definitely something you want prominently
       | documented. Especially anything that could potentially segfault
       | should be highlighted.
       | 
       | Oh, and some benchmarks would be very interesting!
        
       | iainmerrick wrote:
       | Sorry for going a bit off-topic, but something I'm curious about:
       | are channels thought to be a good/useful tool for concurrent
       | programming?
       | 
       | My feeling is that they're probably too low-level and error-
       | prone, and you really want higher-level structures like worker
       | pools or actors. So as a concurrency building block they'd be on
       | the same abstraction level as pthreads or Java monitors -- nice
       | and flexible for building on top of, but too finicky to be used
       | _directly_ in application code.
       | 
       | But I'm not a Go expert, and maybe Go programmers do successfully
       | use channels directly for application code?
        
         | JyB wrote:
         | The opposite. Channels are often used to orchestrate higher-
         | level structures/abstractions; not 'low-level' stuff where more
         | common primitives such as mutexes/semaphores are sometimes
         | preferred.
        
       | kazinator wrote:
       | 1. There is no <memory.h> in any version of ISO C, nor in POSIX.
       | 
       | 2. This is not C89; at best GNU C 89, because a variable is
       | declared after a statement:
       | 
       |     {
       |       if( msgLen == 0 )
       |         msgLen = 1;
       |       /* queueLen == 0 is an unbuffered channel, but we still need
       |          one slot to transport the message */
       |       CspChan_t* c = (CspChan_t*)malloc(sizeof(CspChan_t) + queueLen*msgLen);
       | 
       | To help enforce that you're actually writing C89, you should set
       | your compiler to C89 and turn on whatever additional diagnostics
       | may be required like -pedantic.
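For comparison, the quoted fragment could be rearranged roughly as follows to stay within strict C89: every declaration precedes the first statement, and a pointer to the trailing storage stands in for a C99 flexible array member. This is a sketch under assumptions. The `Chan` layout, the parameter types, the `slots` field and the `chan_slot` helper are all hypothetical, since the thread shows only a few lines of the real function. It is meant to be checked with something like `gcc -std=c89 -pedantic-errors`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout; the real CspChan_t fields are not shown in the thread. */
typedef struct Chan {
    size_t slots;         /* number of message slots actually allocated */
    size_t msgLen;        /* bytes per message */
    unsigned char *data;  /* points just past the header */
} Chan;

Chan *chan_create(unsigned short queueLen, unsigned short msgLen)
{
    Chan *c;              /* C89: declarations come before any statement */
    size_t slots;

    if (msgLen == 0)
        msgLen = 1;
    /* queueLen == 0 is an unbuffered channel, but one slot is still
       needed to transport the message */
    slots = (queueLen == 0) ? 1 : queueLen;

    c = (Chan *)malloc(sizeof(Chan) + slots * msgLen);
    if (c == NULL)
        return NULL;
    c->slots = slots;
    c->msgLen = msgLen;
    c->data = (unsigned char *)(c + 1);
    return c;
}

/* Address of slot i inside the trailing storage. */
unsigned char *chan_slot(const Chan *c, size_t i)
{
    assert(c != NULL && i < c->slots);
    return c->data + i * c->msgLen;
}

void chan_dispose(Chan *c)
{
    free(c);
}
```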
        
       ___________________________________________________________________
       (page generated 2023-12-13 23:00 UTC)