[HN Gopher] Io_uring By Example: cat, cp and a web server with i...
       ___________________________________________________________________
        
       Io_uring By Example: cat, cp and a web server with io_uring
        
       Author : shuss
       Score  : 139 points
       Date   : 2020-04-06 15:08 UTC (7 hours ago)
        
 (HTM) web link (unixism.net)
 (TXT) w3m dump (unixism.net)
        
       | Aissen wrote:
       | It might be a newbie question, but why use fputc() to output
       | character by character instead of using liburing to write
       | directly to fd 1 (or writev)?
        
         | shuss wrote:
         | Yes, you could do that. But I wanted to keep the example
         | programs in the beginning very simple. In the same vein, they
         | only deal with one request at a time, for instance.
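
For reference, the writev route mentioned above could look roughly like the sketch below. It is not taken from the article: it assumes plain POSIX writev(2) rather than liburing, and `write_all_iov` is a hypothetical helper name. The loop is there because writev() may write fewer bytes than requested.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write every byte described by iov[0..iovcnt) to fd.
 * The iovec array is modified while advancing past partial writes.
 * Returns 0 on success, -1 on error (errno is set by writev). */
static int write_all_iov(int fd, struct iovec *iov, int iovcnt)
{
    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            return -1;
        }
        /* Skip buffers that were written completely. */
        while (iovcnt > 0 && (size_t)n >= iov->iov_len) {
            n -= (ssize_t)iov->iov_len;
            iov++;
            iovcnt--;
        }
        /* Advance into a buffer that was only partially written. */
        if (iovcnt > 0 && n > 0) {
            iov->iov_base = (char *)iov->iov_base + n;
            iov->iov_len -= (size_t)n;
        }
    }
    return 0;
}
```

Calling `write_all_iov(STDOUT_FILENO, iov, iovcnt)` would stand in for a per-character fputc() loop. The caller's iovec array is consumed in the process, so keep a copy if the buffers need to be freed through it later.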
        
           | Aissen wrote:
           | Makes sense, thanks!
        
       | lazerl0rd wrote:
       | I maintain an NGINX fork with io_uring support (instead of AIO)
       | here https://github.com/lazerl0rd/nginx. io_uring has been
       | providing promising results, and recent updates to Linux have
       | just been constantly improving it.
        
       | graetzer wrote:
       | I probably don't really know what I'm talking about, since I've
       | only ever used async IO through boost.asio:
       | 
       | Isn't it a bit crazy that there are so many different ways of
       | doing async IO on Linux alone? You would think that this is a
       | more or less solved problem by now.
        
         | knome wrote:
         | This introduction to io_uring has some history of the interface
         | that may answer your question.
         | 
         | https://kernel.dk/io_uring.pdf
        
         | kccqzy wrote:
         | No, I don't think so.
         | 
         | The terminology here is a bit confusing because different
         | people have defined "async" differently, but in the strictest
         | sense of the word, before io_uring the only other way to do
         | async IO was POSIX AIO. See aio(7) in your man pages:
         | http://man7.org/linux/man-pages/man7/aio.7.html
         | 
         | On the other hand, using select(2), poll(2), and epoll(7) is
         | not really async IO.
         | 
         | The difference is really simple: for non-async IO, whenever you
         | use read(2) or write(2) and that call returns without an errno,
         | that operation is decidedly complete from the perspective of
         | the user space. The buffer might not actually make it to disk
         | (say because of caching) or the network (say because of the
         | Nagle algorithm). But really from the user space perspective it
         | is done.
         | 
         | What about "async" libraries in user space? All the bells and
         | whistles added by these libraries in user space merely give you
         | support for knowing when a file descriptor is ready. It doesn't
         | make the actual reading or writing asynchronous. So with such a
         | library, your code might appear to call read(), but the
         | framework knows that the file descriptor isn't actually ready,
         | suspends your greenthread/fiber/coroutine or whatever it uses
         | and does something else. The actual read(2) call doesn't
         | happen. The kernel doesn't know you want to read.
         | 
         | With io_uring, you actually deal with read or write requests
         | sent to the kernel that are incomplete. You actually tell the
         | kernel you want to read (by writing to a ring buffer, hence the
         | name). The kernel knows you want to read. That's the
         | difference.
         | 
         | Now don't get me wrong but I think for the majority of
         | applications you don't need true async IO. All you really need
         | is efficient notification of whether a file descriptor is
         | ready. So AIO and io_uring are both niche topics that likely
         | won't affect your app.
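
The readiness model described in the comment above can be sketched in plain POSIX C. This is an illustration under assumptions, not code from the article or from any framework: `set_nonblocking` and `read_when_ready` are hypothetical names, and a real event loop would run other work instead of sitting in poll(2). The point it shows is that read(2) is issued only after the kernel reports the descriptor ready.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: wait until fd is reported readable, then read.
 * Returns bytes read, 0 on EOF, or -1 on error or timeout
 * (errno is ETIMEDOUT when the timeout expired). */
static ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        int rc = poll(&pfd, 1, timeout_ms);   /* readiness notification only */
        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (rc == 0) {
            errno = ETIMEDOUT;                /* nothing became ready */
            return -1;
        }
        ssize_t n = read(fd, buf, len);       /* the actual I/O, done by us */
        if (n >= 0)
            return n;
        if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
            continue;                         /* spurious wakeup: poll again */
        return -1;
    }
}
```

Until poll() returns, the kernel has no pending read on behalf of this code. With io_uring the read request itself would be queued to the kernel and completed later.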
        
           | loeg wrote:
           | Really importantly, the behavior of poll/select/epoll on
           | _files_ is to always return "available." But a read from
           | that file may still sleep waiting on disk -- available does
           | not mean the result is already in memory. They're only really
           | effective primitives for networking sockets and artificial
           | fds like signalfd().
           | 
           | Prior to io_uring, libraries that provide async file access
           | in userspace (by necessity) use some kind of threadpool, with
           | each thread processing an operation synchronously and
           | providing an async result via self-pipe or other user-driven
           | event.
        
       | aseipp wrote:
       | Nice set of articles, thank you! io_uring is actually pretty easy
       | to use with liburing, and seeing more people adopt it is
       | exciting.
       | 
       | The problem with io_uring, really, is that every kernel release
       | has now become much, much more exciting than the last one due to
       | all the improvements (cf. 5.7 now having buffer selection
       | primops, big deal!) I'm constantly, anxiously, ever-awaiting the
       | next kernel release... :(
        
       | Matthias247 wrote:
       | > the readv() call will block until all iovec buffers are
       | filled with file data. Once it returns, we should be able to
       | access the file data from the iovecs and print them on the
       | console.
       | 
       | This is wrong. readv() can return as soon as a single byte has
       | been read, in a similar fashion to read(). If you need to read
       | all bytes you have to use a loop. This program is potentially
       | printing uninitialized memory.
       | 
       | I think the same applies to the uring examples, which also don't
       | seem to check the actual processed bytes.
       | 
       | I'm also actually not sure if in the uring version requests are
       | guaranteed to be processed in order or not. I would have assumed
       | the latter - and thereby the printed result could have been out
       | of order.
       | 
       | Obviously the demos also omit all the required memory management,
       | but I guess that was intended. If we added it, though, the
       | uring versions would grow more in complexity than the
       | synchronous version.
       | 
       | ==> IO is hard. uring is exciting for high-performance use cases,
       | but it will make things harder rather than easier. Ideally,
       | though, most end users would not have to use it directly (nor
       | liburing), but would instead use an async/await/coroutine
       | framework on top of it that makes the asyncness transparent to
       | the application and helps avoid most of the pitfalls.
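
The loop asked for above might look like the following for the synchronous readv() case. This is a sketch, not the article's code: `readv_full` is a hypothetical name, and it relies only on the POSIX rule that readv(2) may return fewer bytes than the buffers can hold.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: keep calling readv() until every buffer is full
 * or EOF is reached. The iovec array is modified while advancing.
 * Returns the total number of bytes read, or -1 on error. */
static ssize_t readv_full(int fd, struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        ssize_t n = readv(fd, iov, iovcnt);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before reading: retry */
            return -1;
        }
        if (n == 0)
            break;                 /* EOF (or nothing left to fill) */
        total += n;
        /* Skip buffers that are now completely filled. */
        while (iovcnt > 0 && (size_t)n >= iov->iov_len) {
            n -= (ssize_t)iov->iov_len;
            iov++;
            iovcnt--;
        }
        /* Advance into a buffer that was only partially filled. */
        if (iovcnt > 0 && n > 0) {
            iov->iov_base = (char *)iov->iov_base + n;
            iov->iov_len -= (size_t)n;
        }
    }
    return total;
}
```

The caller should print only the number of bytes returned, not the full size of the buffers, which avoids printing uninitialized memory after a short read or at EOF.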
        
         | yxhuvud wrote:
         | > I'm also actually not sure if in the uring version requests
         | are guaranteed to be processed in order or not. I would have
         | assumed the latter - and thereby the printed result could have
         | been out of order.
         | 
         | You assume correctly. There is no ordering guarantee, unless
         | events are chained (which is a special flag).
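
One common way to cope with unordered completions, besides chaining, is to tag each request and place its result by tag. The sketch below only simulates this in plain C and makes no io_uring calls: `struct completion`, `BLOCK_SZ` and `place_by_tag` are hypothetical, with `tag` standing in for the 64-bit `user_data` value that io_uring copies from a submission to its completion.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SZ 4   /* hypothetical fixed block size for the example */

/* Simulated completion record: not an io_uring structure. */
struct completion {
    unsigned long long tag;   /* block index, stands in for user_data */
    const char *data;         /* bytes that were read for this block */
    size_t len;               /* number of valid bytes, <= BLOCK_SZ */
};

/* Copy each completed block to the offset its tag names, regardless of
 * the order in which completions arrived.
 * Returns 0 on success, -1 if a block would not fit in out. */
static int place_by_tag(char *out, size_t out_len,
                        const struct completion *done, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (done[i].tag > out_len / BLOCK_SZ)
            return -1;                       /* offset beyond the buffer */
        size_t off = (size_t)done[i].tag * BLOCK_SZ;
        if (done[i].len > BLOCK_SZ || done[i].len > out_len - off)
            return -1;                       /* block does not fit */
        memcpy(out + off, done[i].data, done[i].len);
    }
    return 0;
}
```

With completions arriving as blocks 2, 0, 1, the output buffer still ends up in file order, so it can be printed once all requests have completed.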
        
       | unlinked_dll wrote:
       | I almost certainly have no idea what I'm talking about but
       | 
       | Can I write a device driver using io_uring? As in talk to i2c
       | endpoints instead of normal files?
        
         | shuss wrote:
         | Nope. io_uring is an API that helps with asynchronous I/O.
        
           | unlinked_dll wrote:
           | That's kind of my question. I want async I/O with a device,
           | can io_uring help?
        
             | navaati wrote:
             | As neighbour says, io_uring is for really high performance
             | (as in high throughput) stuff, so... not I2C!
             | 
             | But you can still want async I/O with an I2C device, of
             | course, as in you don't want to block a thread to wait on a
             | message. And for that I believe you can still use good old
             | select/epoll as usual on your I2C device file, and as a
             | consequence also just use your favourite async I/O
             | framework of the day (libevent, node.js, what have you).
        
             | asveikau wrote:
             | Does your device have a file descriptor, as is traditional
             | in Unix? Seems like the theoretical answer should then be
             | yes.
        
             | Aissen wrote:
             | If you have a character/block device that does read/write
             | ops, you'd probably get io_uring support for free.
        
         | sly010 wrote:
         | I don't know the answer to that question, but if you are
         | developing an embedded project (and especially if it's a
         | Raspberry Pi) you can mmap() the hardware I2C registers to
         | userspace memory and use them directly. You will have to
         | basically write the I2C driver in userspace and it won't be
         | portable but it's not that hard and it will make for very low
         | latency I2C communication. I had to do this once for latency.
        
         | Skunkleton wrote:
         | I2C and high performance async IO are pretty rarely overlapping
         | requirements. I'm curious what your use case is?
        
         | Veserv wrote:
         | From what I understand and looking at the supported opcodes [0]
         | io_uring allows submitting certain "system calls" (and some
         | other auxiliary operations you would normally expect in an
         | async API) asynchronously. So, if you can interact with your
         | device driver using those system calls, then you should be able
         | to do so. I am personally not familiar with Linux, so I am
         | unsure if those system calls are sufficient for normal
         | operation, but it lacks ioctl, so it is not sufficient for
         | total replacement in all cases.
         | 
         | [0]:
         | https://github.com/torvalds/linux/blob/master/include/uapi/l...
         | search IORING_OP_
        
       | frevib wrote:
       | Here is an echo server that uses some of the 5.7 features:
       | 
       | https://github.com/frevib/io_uring-echo-server/tree/io-uring...
       | 
       | One of the 5.7 features, IORING_FEAT_FAST_POLL, gives a (free)
       | performance boost of up to 68% compared to epoll:
       | 
       | https://twitter.com/hielkedv/status/1234135064323280897?s=21
        
       | dirtydroog wrote:
       | Does it support timers / timeouts?
       | 
       | Thanks for the article, it's very appealing to start to play
       | around with this.
        
         | shuss wrote:
         | Yes. Please see io_uring_wait_cqe_timeout() in liburing.
        
         | yxhuvud wrote:
         | Yes, assuming you have a fresh enough kernel.
        
       | megous wrote:
       | This seems to assume there's always space in the ring during
       | submission at various places in the code.
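
To make the concern concrete: a fixed-size ring has to report "full" somehow, and the caller has to handle it. In liburing, io_uring_get_sqe() returns NULL when no submission entry is free. The toy model below is plain C and not the kernel's layout. `toy_ring`, `toy_get_slot` and `toy_consume` are hypothetical names used only to show the check.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_ENTRIES 4u   /* must be a power of two for the index mask */

/* Toy single-threaded ring: indices grow forever and are masked on use. */
struct toy_ring {
    unsigned head;               /* next slot the consumer will take */
    unsigned tail;               /* next slot the producer will fill */
    int slots[RING_ENTRIES];
};

/* Returns a free slot to fill in, or NULL when the ring is full. */
static int *toy_get_slot(struct toy_ring *r)
{
    if (r->tail - r->head == RING_ENTRIES)
        return NULL;             /* caller must drain before retrying */
    return &r->slots[r->tail++ & (RING_ENTRIES - 1)];
}

/* Takes the oldest entry; returns false when the ring is empty. */
static bool toy_consume(struct toy_ring *r, int *out)
{
    if (r->head == r->tail)
        return false;
    *out = r->slots[r->head++ & (RING_ENTRIES - 1)];
    return true;
}
```

Code that assumes a slot is always available would dereference NULL in the full case. A more defensive caller checks the result and, when it is NULL, submits or drains what is already queued before trying again.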
        
       ___________________________________________________________________
       (page generated 2020-04-06 23:00 UTC)