[HN Gopher] Learn Rust with entirely too many linked lists (2019)
       ___________________________________________________________________
        
       Learn Rust with entirely too many linked lists (2019)
        
       Author : pcr910303
       Score  : 260 points
       Date   : 2020-02-22 12:29 UTC (10 hours ago)
        
 (HTM) web link (rust-unofficial.github.io)
 (TXT) w3m dump (rust-unofficial.github.io)
        
       | brundolf wrote:
       | When I was first getting started with Rust this tutorial was eye-
       | opening. It gave me a clear view into the somewhat impenetrable
       | world of working with Boxes (pointers to the heap) directly, in
       | the context of the borrow-checker.
       | 
       | It also convinced me that you usually just want to use the
       | standard library data structures if you can :P
        
       | xlap wrote:
       | Basically anything published about Rust is turning me away from
       | the language.
       | 
       | It could be unfair though: Technical writing (and that includes
       | humorous opinionated pieces) has declined dramatically in the
       | last 10 years.
       | 
       | Or perhaps writing as a whole has declined.
        
         | Dylan16807 wrote:
         | It wouldn't hurt to be specific.
        
       | zackify wrote:
       | Can't wait to follow this. I've been going through the official
       | book these past couple weeks. As someone who's never used systems
       | languages (mainly a js / node person) I was able to write a
       | redis-like database quickly on top of TCP. I've also picked up
       | async / await and how it works in rust.
       | 
       | There's so much to learn but the compiler is surprisingly
       | helpful. I loved Typescript and it is like TS on steroids. When I
       | get stuff compiling I have so much confidence.
        
       | SAI_Peregrinus wrote:
       | Lots of these comments seem confused about the purpose of this.
       | It's not about using linked lists (that's easy, they're in
       | std::collections), it's about implementing them. So the common
       | non-kernel/embedded use cases are well supported, just use
       | std::collections::LinkedList! But if you're in a no-std context
       | then you might need this.
        
       | caconym_ wrote:
       | A while ago now, this was the tutorial that got me to the point
       | where I could implement linked structures in Rust. Highly
       | recommended.
        
       | kybernetikos wrote:
       | I really value this guide.
       | 
       | Writing data structures in a language is one of the standard ways
       | that I convince myself that I understand it. Rust really messes
       | hard with that mindset.
       | 
       | This guide talked me through all the things I'm not used to
       | worrying about in garbage collected languages, showed me the
       | error messages that I'd been banging my head against, explained
       | what they meant, and did so with humor.
        
       | winrid wrote:
       | I'll try this at some point. My first couple forays into Rust
       | made me miss Java and C.
        
       | wcrichton wrote:
       | Another great space of exercises in this vein are in-place
       | operations on binary trees.
       | 
       | Specifically, rebalancing [1] proved particularly tricky for my
       | students. I think it's a good litmus test for whether you
       | understand ownership, borrowing, and algebraic data types.
       | 
       | [1] http://cs242.stanford.edu/f19/assignments/assign6/#22-bst-
       | in...
        
       | matthewaveryusa wrote:
       | >You're doing some awesome lock-free concurrent thing.
       | 
       | I can confirm lists are pretty rarely used. The four times I've
       | used a linked list in 10 years of professional programming were:
       | 
       | -quick&dirty hashmap in C
       | 
       | -threadsafe queue
       | 
       | -keeping track of a set of objects that can't be copied (
       | threads)
       | 
       | -LRU cache
       | 
       | In C++ at least, the nice thing about a list is you can push/pop
       | in the front/back without a reallocattion or invalidating
       | iterators.
        
         | chongli wrote:
         | Linked lists are used a lot in game programming where the
         | worst-case behaviour of std::vector and similar structures is
         | undesirable. Linked lists may be slow for a variety of reasons
         | but they're simple and predictable which is very nice when
         | you're trying not to miss the 16.67ms per-frame window.
        
           | SAI_Peregrinus wrote:
           | Used, yes, butbnot implemented. std::collections::LinkedList
           | is a doubly-linked list suitable for games. It's basically
           | just the no-std crowd that might need to implement their own.
        
           | estebank wrote:
           | I thought it was the other way around: games using arrays,
           | non arbitrarily growable vectors to precisely be able to
           | iterate over everything quickly in a deterministic fashion.
           | Specially because a lot of what games have to deal with is
           | "visit every node".
        
             | chongli wrote:
             | They use arrays for static vertex, texture, etc. data to
             | send to the GPU. They don't use arrays for things that are
             | being created and destroyed all the time, such as units in
             | a strategy game, they use lists for those things.
        
               | estebank wrote:
               | But do they use linked lists for them? I would have
               | assumed you would use an arena for something like that,
               | precisely because you already need to iterate over all of
               | them and can represent any graph as keys into a
               | generational index, so that reading stale keys is
               | trivially handled.
               | 
               | Again, I'm not a game dev and this is just my
               | understanding after reading up on these topics after
               | watching https://youtu.be/aKLntZcp27M.
        
         | big_chungus wrote:
         | I've used linked lists a little more because I mostly do C
         | rather than C++. Sometimes I end up using it as a quick-and-
         | dirty vector when I don't want to bother writing something or
         | using a library, though I typically use some kind of
         | block/arena allocator vs. just malloc.
        
       | Animats wrote:
       | Doubly-linked lists are hard with Rust's ownership system. I've
       | argued for Rust having backpointers as a built-in type the borrow
       | checker understands. You have to maintain the invariant that if A
       | points to B, B points back to A. A is an owning pointer, and B is
       | a non-owning pointer locked in a relationship with A. Easy to
       | check if the language lets you say that's what you're doing. Rust
       | lacks that.
       | 
       | If you have safe backpointers, most tree-type data structures
       | with backpointers can be constructed. A nice feature to have.
        
         | magicalhippo wrote:
         | As a non-Ruster, any particular reason they didn't include
         | this? A lot of the code I've done over the years involve
         | graphs, which have a lot of circular pointer and/or
         | backpointers. I find it kinda weird they'd make such a common
         | thing a PITA in a new language.
        
           | eximius wrote:
           | Rust has a concept of ownership and borrowing. (I think)
           | everything is fine when you just have immutable references
           | everywhere. But as soon as you start taking mutable
           | references (of which only one can exist at a time) or
           | ownership (stronger than mutable references), then cycles
           | break things.
           | 
           | You can get around it with Interior Mutability, it's just
           | slightly more verbose.
           | 
           | Also, it isn't really that they made an easy thing hard.
           | Properly maintaining the invariants with cyclical references
           | is hard and unsafe. Rust exposes that difficulty in a very
           | literal way
        
           | dodobirdlord wrote:
           | Rust's ownership rules aren't for the purpose of making the
           | user's life hard, they are the least restrictive system that
           | could be devised that allow the compiler to uphold Rust's
           | safety guarantees. It was not too long ago that it was common
           | knowledge that a programming language either has a garbage
           | collector or has manual memory management, or possible both.
           | Safe Rust has neither, but not without the effect of making
           | awkward certain kinds of programming that are fundamentally
           | difficult to make safe.
           | 
           | The Rust question to ask here is: Do you really need
           | _pointers_ , specifically? Or would some other reference
           | mechanism work? You could put all of your graph nodes into a
           | linear data structure like an array, and have them point to
           | each other by holding a list of indices instead of a list of
           | pointers. Or you could give all of your nodes unique keys and
           | keep them in a table, and have them hold references to each
           | other by key. The compiler will not try to prove the
           | correctness of your graph algorithm, and in the event of
           | programmer error that leads to dangling references your
           | program will have to handle the scenario of following an
           | index or a key and not finding a value, so bugs will not
           | introduce memory unsafety.
           | 
           | There's also ongoing work on memory arenas in the nightly
           | compiler, and I believe some libraries. Putting a graph into
           | an arena is a good way to appease the compiler, because the
           | entire arena will be freed at the same time, ensuring that
           | hanging pointers will never exist between graph nodes.
        
           | lrem wrote:
           | In your graph, the nodes are not owning each other, are they?
        
             | magicalhippo wrote:
             | No, but they do point to each other. For example, each node
             | can have an array of all the edges, and each edge has two
             | pointers to each node it connects. Via this you can walk
             | around cycles for example. And typically these can change
             | when nodes are inserted or removed.
        
           | Animats wrote:
           | This is an enforced invariant. This results in proposals that
           | it should be done by adding a system for general invariants
           | and design by contract features. Those are then rejected as
           | too complicated and something for the far future.
           | 
           | So, the usual answer is "use unsafe code." What could
           | possibly go wrong?
        
           | moth-fuzz wrote:
           | Historically, the classic graph structures are adjacency
           | lists and adjacency matrices. Those are quite easy to
           | represent in Rust if you use indices instead of pointers.
           | Node/Pointer type graph structures aren't particularly
           | efficient anyway, even in C or C++.
        
           | habitue wrote:
           | It might actually be possible now with Pin. Could anyone
           | actually knowledgeable about this comment?
        
             | renox wrote:
             | I don't know much about Rust (and nearly nothing about Pin)
             | but if this was true, don't you think someone would already
             | have written a library providing doubly linked list without
             | using unsafe?
        
         | dpbriggs wrote:
         | You can somewhat emulate this with Weak pointers:
         | 
         | https://doc.rust-lang.org/std/rc/struct.Weak.html
         | 
         | They don't count against ownership, but do bring some extra
         | headaches of their own (referencing counting overhead, etc).
        
       | ubercow13 wrote:
       | This is great as an intro to Rust, I preferred this as a Rust
       | starter tutorial to the main rust book or any other tutorial I
       | tried
        
         | Waterluvian wrote:
         | I'm loving both. The book is comprehensive and teaches you even
         | the most basic concepts, so it's great for a broader skill
         | range.
         | 
         | But I love this one too because it digs into some CS
         | archaeology and that helps me dig into the theory and history.
         | I doubt I'll ever have to implement a linked list but knowing
         | how they work and are implemented and their advantages and
         | drawbacks is great.
         | 
         | Tangentially, there's a wonderful game called Human Resource
         | Machine which teaches linked lists and other assembly-like
         | programming without you even realising it.
        
           | ubercow13 wrote:
           | The book is very comprehensive but it didn't click for me as
           | a beginner, I found much of the exposition left me with
           | unanswered questions while it went on to cover more ground.
           | I'm not sure if those questions were answered later but I got
           | lost very quickly. Maybe it's aimed at people with more C++
           | background?
           | 
           | This linked list tutorial answered every question I thought
           | of almost exactly as I thought of them, which made it a joy
           | to read. I also liked Rust By Example [1] over the book.
           | Based on my experience I'd recommend this tutorial and then
           | implementing something using Rust By Example as reference.
           | But everyone learns differently!
           | 
           | [1] https://doc.rust-lang.org/rust-by-example/
        
       | geraldbauer wrote:
       | A little more light-hearted but the same style to learn Ruby with
       | Entirely Too Many Fizz Buzzes. See
       | https://yukimotopress.github.io/fizzbuzz
        
       | ajross wrote:
       | > Mumble mumble kernel embedded something something intrusive.
       | 
       | And this is why kernel/embedded/something/something developers
       | don't take Rust as seriously as you want them to.
       | 
       | You can't simultaneously declare your language the best choice
       | for system software development _and_ treat the long-evolved
       | patterns of those paradigms as a joke.
       | 
       | There are very good reasons for intrusive data structures, not
       | least of which being their ability to operate in contexts where
       | no heap is available. If you don't understand them or don't want
       | to talk about them or want to limit your discussion to situations
       | with different requirements, then say so.
        
         | jasondclinton wrote:
         | https://github.com/Amanieu/intrusive-rs implements what you're
         | looking for with no heap. There's no need to flame or
         | misrepresent what the book says (no one said embedded was a
         | joke). The Rust embedded community is strong.
        
           | ajross wrote:
           | You're like the sixth person to argue I'm somehow
           | "misrepresenting" what is being said in this book? I'm
           | quoting it directly. The "flame" is in the original, and I'm
           | responding to it.
           | 
           | Take it out. It's bad.
        
             | jasondclinton wrote:
             | Assuming that you're not trolling, I'll explain how you
             | misinterpreted what you read: the section that you quoted
             | is a sub-heading under:
             | 
             | > An Obligatory Public Service Announcement
             | 
             | which is an argument that:
             | 
             | > Linked lists are as niche and vague of a data structure
             | as a trie.
             | 
             | The author then goes on to state that many people have
             | contacted him to argue that linked lists are not niche and
             | he puts each of those arguments under a heading:
             | 
             | > Mumble mumble kernel embedded something something
             | intrusive.
             | 
             | is one such heading. Inside of it, he continues the
             | argument that _linked lists for embedded no-heap scenarios
             | are niche_ and--to continue his argument--we therefore
             | shouldn 't be teaching undergrads linked lists just like we
             | don't teach them tries.
        
         | robbyt wrote:
         | To explain what I think this comment means. When working on
         | embedded systems, you can interact with hardware devices by
         | writing directly to specially mapped memory areas.
         | 
         | E.g., if you want to write text to a small screen, the kernel
         | driver gives you a memory region that you write bytes to, and
         | they're shown on the screen immediately, without requiring the
         | CPU.
        
           | monocasa wrote:
           | They're saying more that intrusive data structures are really
           | nice in situations that restrict heap allocation. The memory
           | overhead outside of the structures linked together is O(1),
           | because all of the per object metadata is stored in the
           | objects themselves. That means consumers generally don't have
           | to be hardcoded for certain number of objects like you would
           | for an array/vector that doesn't have access to a heap.
        
           | redrobein wrote:
           | > To explain what I think this comment means. When working on
           | embedded systems, you can interact with hardware devices by
           | writing directly to specially mapped memory areas.
           | 
           | And why can't you do this in rust?
        
             | jasondclinton wrote:
             | You can. Rust allows full memory addressing.
        
             | lawn wrote:
             | There's nothing to prevent you from doing this, although
             | you may have to wrap it in an unsafe block.
        
         | fortran77 wrote:
         | My feelings, too. I do both embedded programming (in C/C++ and
         | CUDA) and Erlang programming for a living. Erlang, too, doesn't
         | let you (easily) write your own linked list structures, etc.
         | 
         | But for embedded programming with tight memory or performance
         | constraints these data structures are essential so we use C++
         | or even C. They're well understood and the implementations have
         | simple, elegant solutions.
         | 
         | For "safety" when we don't need absolute control, we'll choose
         | a GC language like C# or F#. No need for the complication of
         | Rust.
        
           | saagarjha wrote:
           | Rust gives you just as much control as C when you need it.
        
             | throwaway17_17 wrote:
             | I may be mistaken, but if using unsafe does not allow for
             | the 'borrow-checker' to be turned off and allow for code
             | which doesn't abide by the checker's requirements, then it
             | clearly does not give "as much control as C". Again, I
             | might have missed some of the subtleties of circumventing
             | Rust's static checking, but I don't think I can purposely
             | create some non-deterministic racy-appearing abstractions,
             | which would be trivial to do using global variables in C.
        
               | nicoburns wrote:
               | You can't turn off the borrow checker for references, but
               | Rust also provides raw pointers which are not subject to
               | borrow checking (these are exactly like pointers in C,
               | and can be cast to and from references (this is a no-op
               | at runtime since they share the same memory
               | representation)).
               | 
               | https://doc.rust-lang.org/1.30.0/book/first-edition/raw-
               | poin...
               | 
               | You can create and manipulate raw pointers in safe code,
               | but dereferencing them requires an `unsafe` block.
        
               | zozbot234 wrote:
               | AIUI, there are some hardware architectures where even
               | _creating_ a wild pointer might be undefined behavior,
               | regardless of whether that pointer is subsequently
               | dereferenced, and C inherits these requirements. This
               | means that it might be desirable to restrict creation and
               | manipulation of raw pointers to unsafe code in Rust as
               | well, if this can be done without introducing undue
               | incompatibilities.
               | 
               | (Rust editions would naturally allow for this: Rust 2021
               | would warn on creating/manipulating raw pointers in Safe
               | Rust, and stop warning for "unnecessary" use of unsafe
               | deriving from these operations; Rust 2024 would make
               | these a hard error ouside `unsafe`.)
        
               | aw1621107 wrote:
               | > AIUI, there are some hardware architectures where even
               | creating a wild pointer might be undefined behavior,
               | regardless of whether that pointer is subsequently
               | dereferenced
               | 
               | Would you be able to point me to some references for such
               | hardware? Im not sure how that would work (at least based
               | on my admittedly limited amount of experience). Wouldn't
               | a pointer look like any other integer right up until it's
               | used as a memory operand? Or would said architecture have
               | some way to distinguish pointers and a "regular" integer
               | in registers?
        
               | Hello71 wrote:
               | Allegedly, some platforms have pointer trap
               | representations, where a certain pointer can be created,
               | but may not be used in any operations of that type. No
               | modern systems have such trap representations for pointer
               | types, but the C standard inherits their legacy, and,
               | more importantly, C compilers use it as justification for
               | certain types of optimizations. Since it's not a hardware
               | limitation, Rust can perfectly well take the opposite
               | path and say that the compiler may not use it for those
               | optimizations.
               | 
               | https://stackoverflow.com/questions/6725809/trap-
               | representat...
        
               | zozbot234 wrote:
               | > No modern systems have such trap representations for
               | pointer types
               | 
               | This may be incidentally true, but "address
               | sanitizer"-like features are becoming more common on
               | modern hardware, and while these do not currently trap on
               | _creation_ / _manipulation_ of a  'wild' pointer (since,
               | strictly speaking, a trap only happens on dereferencing),
               | there's no solid reason to expect this to remain the case
               | in the future.
        
               | int_19h wrote:
               | I don't see how you could trap creation or manipulation,
               | since those pointers are stored in registers and/or
               | memory, and both are fundamentally untyped. How would the
               | hardware even know that something is a pointer, on any
               | architecture that is popular today?
        
               | Dylan16807 wrote:
               | Maybe. Or maybe you'd just change it so 'wild pointers'
               | created in safe code are stored as ints on that
               | architecture.
        
               | fortran77 wrote:
               | It's sad that we have to create throwaway accounts if we
               | want to be able to simply say we prefer or use other
               | languages over Rust without getting downvoted so much
               | that we'll end up shadowbanned.
        
               | [deleted]
        
               | saagarjha wrote:
               | > if using unsafe does not allow for the 'borrow-checker'
               | to be turned off and allow for code which doesn't abide
               | by the checker's requirements
               | 
               | It does, that's why it's there.
        
               | steveklabnik wrote:
               | Unsafe doesn't turn off the borrow checker, but it does
               | give you access to pointers that aren't checked.
        
               | zozbot234 wrote:
               | > if using unsafe does not allow for the 'borrow-checker'
               | to be turned off and allow for code which doesn't abide
               | by the checker's requirements
               | 
               | It allows the latter. 'Code that doesn't abide by the
               | checker's requirements' uses separate facilities that are
               | only allowed in unsafe code. This means that `unsafe`
               | doesn't _have_ to turn off anything, and further
               | pinpoints the parts of the code where caution is needed
               | in order to maintain the invariants that Safe Rust is
               | based on.
        
           | terminaljunkid wrote:
           | In before someone from rust evangelism strike force chimes in
           | saying rust is actually very simple and borrow checker takes
           | 2 days to get familiar.
        
         | [deleted]
        
         | steveklabnik wrote:
         | It explicitly says that there are good use cases, just that
         | they're very rare.
        
           | ajross wrote:
           | It does not. Maybe somewhere else it does. That section,
           | verbatim, says:
           | 
           |  _It 's niche. You're talking about a situation where you're
           | not even using your language's runtime. Is that not a red
           | flag that you're doing something strange?
           | 
           | It's also wildly unsafe.
           | 
           | But sure. Build your awesome zero-allocation lists on the
           | stack._
           | 
           | And this is (1) wildly mistating the requirements and (2)
           | deeply offensive to those of us who work in those regimes.
           | 
           | Obviously, yes, there's room for a reasoned argument about
           | this stuff and whether alternative paradigms can be deployed
           | in heapless contexts. This isn't it.
        
             | MaulingMonkey wrote:
             | > And this is (1) wildly mistating the requirements and (2)
             | deeply offensive to those of us who work in those regimes.
             | 
             | Care to expand how? While not a kernel dev, I've had my
             | share of use cases where I've done exactly this kind of
             | thing in gamedev, and I simply cannot bring myself to
             | disagree with what's been said, or see what's offensive.
             | 
             | It's strange, debugging when the pointers get corrupted by
             | other code exhibiting UB is painful, it's a potential
             | multithreading hazard, and flat contiguous arrays are
             | frequently more appropriate - but it's sometimes useful.
             | It's not _arguing_ that an alternative paradigm can - or
             | even should - be deployed in a heapless context. It 's
             | explicitly admitting that intrusive linked lists are an
             | appropriate paradigm.
        
             | steveklabnik wrote:
             | > It's a fine data structure with several great use cases,
             | but those use cases are exceptional, not common.
        
             | catatattat wrote:
             | > (2) deeply offensive to those of us who work in those
             | regimes
             | 
             | You know how sometimes someone who is A Little Too Online
             | gets offended because they think you said a Bad Thing, but
             | it was actually just a typo or a straight misreading, and
             | you try to explain that they're reacting to something you
             | didn't even say let alone believe, but because they have
             | already decided you are the Bad Person they interpret that
             | as you "doubling down" on the bad opinion that they
             | attributed to you, so they get even angrier and even more
             | convinced that you sincerely believe the Bad Thing?
             | 
             | I mean no judgment. We all have such sensitivities. But
             | maybe now you see how easily you can wind up on the wrong
             | side of a public debate by searching for offense where
             | there is in fact none.
        
         | roblabla wrote:
         | The actual context of your quote:
         | 
         | > Just so we're totally 100% clear: I hate linked lists. With a
         | passion. Linked lists are terrible data structures. Now of
         | course there's several great use cases for a linked list: > > -
         | You're writing a kernel/embedded thing and want to use an
         | intrusive list.
         | 
         | So I've got no clue what you're railing about. The project
         | specifically acknowledges that there is a need in kerneldev for
         | those data structures.
         | 
         | I'm a kernel dev using Rust for my kernel. I use both intrusive
         | linked list and growable vectors in it. The thing is, the
         | sentiment expressed in the article really resonates in me: in
         | most cases, growable vectors are a better choice, performance-
         | wise.
        
           | throwaway17_17 wrote:
           | The quote that the GP is talking about is included below,
           | which copy/pasted from the project page, and the Mumble
           | mumble line is the heading for a paragraph:
           | 
           | '''Mumble mumble kernel embedded something something
           | intrusive.
           | 
           | It's niche. You're talking about a situation where you're not
           | even using your language's runtime. Is that not a red flag
           | that you're doing something strange?
           | 
           | It's also wildly unsafe.'''
           | 
           | Also, prior to this author claims the following, where the
           | first line also a section heading:
           | 
           | ''' I can't afford amortization
           | 
           | You've already entered a pretty niche space'''
           | 
           | These fiat rulings based on one an authors generalization of
           | what is 'niche' are what I assume GP was commenting on. These
           | are the kinds of dismissals that some developers take issue
           | with, as GP states.
        
             | saagarjha wrote:
             | Kernel development _is_ niche. Most of the time you don't
             | need linked lists.
        
               | varjag wrote:
               | When all you have is Rust everything starts to look like
               | adjustable array.
        
               | fluffy87 wrote:
               | Rust has singly linked, double linked ( in its minimal
               | std library) and intrusive linked lists - all with APIs
               | that guarantee a lack of memory errors and data races at
               | compiler. I don't know any good kernel developer that
               | wouldn't like those guarantees and I work in the Linux
               | kernel professionally full time.
        
               | saagarjha wrote:
               | What's an adjustable array? A vector?
        
               | varjag wrote:
               | A vector is a one-dimensional array. Adjustable arrays
               | can have arbitrary number of dimensions, although am not
               | sure if it's a thing in Rust.
        
               | staticassertion wrote:
               | That just sounds like a graph.
        
               | varjag wrote:
               | How in the heaven an array can sound like a graph?
        
               | staticassertion wrote:
               | Sounds like one to me.
        
               | dpbriggs wrote:
               | I've spent years programming rust and I'm not sure what
               | you mean. Do you mean arrays of tuples, or nested arrays?
        
               | varjag wrote:
               | I mean a multi-dimensional array.
        
               | dpbriggs wrote:
               | Thanks! Why does everything look like a multi-dimensional
               | array for you in rust?
        
               | varjag wrote:
               | An array (an ADT) can be arbitrary dimensional, with one-
               | dimensional case often called vector (and two-dimensional
               | called matrix). An array can also be adjustable, both in
               | one and multi-dimension variants.
               | 
               | So a vector can be both adjustable and non adjustable,
               | and some language do have both versions. Some language
               | have adjustable one-dimensional array/vector as the only
               | dynamic aggregate/ordered datatype.
               | 
               | And to answer the post I replied to originally, a vector
               | in rust is a one-dimensional adjustable array. To answer
               | your question above, no that's not what I meant, sorry
               | for not being clear enough from the beginning.
        
               | dpbriggs wrote:
               | Thanks for clarifying.
               | 
               | I think we're just using different definitions. The only
               | real definition of an array I've encountered in work and
               | school is that it's just a one dimensional collection of
               | elements, usually of static size. A vector depending on
               | context is usually the same thing as an array but you
               | conveniently change the size dynamically. Lists can
               | whatever you need it to be given the context, just needs
               | to be sequential.
               | 
               | Of course you can represent higher dimension structures
               | by linearizing indices (x + row_size * y, etc).
               | 
               | I think people are getting confused as most don't
               | consider arrays to be arbitrarily dimensional without
               | some scheme.
               | 
               | Completely off topic but you've reminded me of this great
               | article: https://hypirion.com/musings/understanding-
               | persistent-vector...
        
             | imtringued wrote:
             | If you can't tell when to break or not break rules then
             | maybe you are not a kernel developer? It's almost like a
             | King only following his own laws instead of being the one
             | making them. Why are you a King again?
        
           | ajross wrote:
           | (Had to edit in a quote to what I was replying to because you
           | fixed the original):
           | 
           | > The actual context of your fake quote:
           | 
           | Good grief, that was a verbatim quote! Here's a link to the
           | exact text I quoted:
           | 
           | https://rust-unofficial.github.io/too-many-lists/#mumble-
           | mum...
        
         | [deleted]
        
         | jszymborski wrote:
         | I think it must be stressed that this is an unofficial guide
         | and not "the voice of the project", etc...
        
       | jasondclinton wrote:
       | I went through this a few years ago for fun. It's a great guided
       | tour of the compiler errors when working with complex memory
       | safety needs. The most important thing to know about these is
       | that you would almost certainly never do these in a real project:
       | lists are in `std` but even then the vast majority of cases
       | should use `Vec`.
        
         | Dylan16807 wrote:
         | Linked lists are inherently niche on modern hardware.
        
           | senderista wrote:
           | Yes, along with binary trees, but there are cache-friendly
           | alternatives like unrolled linked lists and B-trees.
        
           | Animats wrote:
           | Yes. Lists were fine when memory was random-access. Today, if
           | you're linking all over memory, cache misses dominate
           | performance. So contiguous data structures are preferred.
        
             | chapium wrote:
             | This isn't a linked list issue at all, but a problem with
             | constructing the nodes. If you construct list nodes in a
             | contiguous location cache misses are a non issue.
        
               | estebank wrote:
               | That's no longer a linked list, it's a vector.
        
               | [deleted]
        
               | mikepurvis wrote:
               | I think the argument is that if the nodes are fixed size
               | then you can preallocate a big block of them slab-style,
               | and for the pointers you just use indexes into that array
               | rather than "real" pointers.
               | 
               | Potentially there's a small memory savings as well if you
               | can use 2 or 4 bytes for the index instead of a full 8
               | byte pointer. But you're taking on a lot of complexity
               | and tuning by going this route so you'd really have to
               | test and make sure the gains were worth it.
        
               | mikepurvis wrote:
               | But that's fundamentally a function of the usage pattern.
               | If the usage is "construct a bunch of nodes" and then
               | nothing, then you should be using a vector-type structure
               | anyway.
               | 
               | The promise of a linked list is being able to iterate and
               | make constant-time insertions/deletions wherever you
               | want. But the more such mutations occur, the more
               | fragmented your memory will end up, and the more you'll
               | get hit by the cache issues--especially if your list is
               | large, exactly where the purported benefits of a linked
               | list are strongest.
        
           | varjag wrote:
           | Seriously? Are trees also niche?
        
             | Dylan16807 wrote:
             | What kind of tree? A binary tree isn't great for many uses,
             | and doesn't excel at much. But a nice fat B+ tree removes
             | the vast majority of pointer-chasing latency.
        
               | kybernetikos wrote:
               | I've got quite keen on B+ trees. It seems like the world
               | where we had fast main memory and slow drive storage is
               | not very different to the world where we have fast cache
               | memory and slow main memory.
        
             | Tuna-Fish wrote:
             | Yes. For the vast majority of uses, you should probably use
             | a hash table instead.
             | 
             | There are still niche uses for both linked lists and trees,
             | but the sad fact is that the relative time it takes to
             | chase a random pointer compared to doing _literally
             | anything else_ keeps getting worse, and is already at the
             | point where frequently linearly copying kilobytes of arrays
             | is almost always faster than using a linked list.
        
               | int_19h wrote:
               | A hash table is vulnerable to collision attacks, if keys
               | are derived from any kind of untrusted external input.
               | Trees might be slow, but they're _consistently_ slow.
        
               | deckard1 wrote:
               | non sequitur. No one mentioned security, nor do any
               | elementary data structures concern themselves with such
               | higher level things. If you're allowing unlimited
               | untrusted data to control internal data structures,
               | you're going to have a bad time regardless of which data
               | structure you're using.
        
               | eximius wrote:
               | And rusts default hashing implementation is hardened
               | against these attacks
        
               | btilly wrote:
               | _...time it takes to chase a random pointer compared to
               | doing literally anything else..._
               | 
               | When I have had to do serious stuff with linked lists,
               | trees, and so on, I found that it was much, much faster
               | to assign an array of nodes, and allocate the linked list
               | out of that close together. There was a lot of complexity
               | in doing so but colocating data to match my access
               | pattern was a big win.
        
               | varjag wrote:
               | If your data/algorithm calls for a certain dynamic
               | structure, let's say a red-black tree, replacing it with
               | array is pointless.
               | 
               | Last time I looked, kobjects in Linux kernel were still
               | dynamically linked in a bunch of ways, and tree-like data
               | structures are widely used.
        
               | staticassertion wrote:
               | I think it's probably fair to call kernel development
               | niche? And the fact that Linux uses linked lists a lot is
               | not necessarily a strong case for linked lists, but just
               | a matter of very specific constraints - like perhaps not
               | wanting to optimize too much for underlying hardware?
        
               | varjag wrote:
               | I used Linux kernel as an example of easily accessible,
               | extremely well known, performance tuned open source
               | project. Of course dynamic data structures are used in
               | countless other applications, like DBMS, web servers and
               | web browsers. Really in nearly any non-trivial codebase.
        
               | adrianN wrote:
               | Sure, but algorithms that explicitly need a linked
               | structure for performance are very few compared to
               | algorithms that have the same or better asymptotic with a
               | hash and a vector or for which the input sizes are such
               | that the large constants pointer chasing causes don't
               | make up for worse asymptotics.
        
               | varjag wrote:
               | If the algorithm calls for a balancing tree, you'd end up
               | re-implementing it across of bunch of (linked) hash
               | tables or over an array treated like a heap. Explicitly,
               | or if one does not know what they are doing, implicitly.
               | In either case you would have no performance advantage.
        
               | btilly wrote:
               | If you know what you are doing, you may well realize a
               | performance advantage.
               | 
               | For example LevelDB (which is conceptually based on
               | Google BigTable's design) looks a lot like a balancing
               | tree in its access patterns, but is a lot faster than a
               | variety of alternatives. And it is faster in part because
               | sequential data is stored sequentially in order instead
               | of using a data structure that results in random access
               | patterns.
               | 
               | And no. It doesn't use linked lists.
        
             | yazaddaruvala wrote:
             | Trees and Lists are types of data structures. Both are very
             | useful in all contexts.
             | 
             | Linked List is a special kind of List. It typically implies
             | a non-sequential data layout.
             | 
             | With all of the modern layers of abstraction, non-
             | sequential memory access typically means poor cache
             | friendliness, i.e. poor performance.
             | 
             | Hopefully that helps explain why Linked Lists are
             | considered niche, I.e. specific to embedded programming or
             | in very special cases when benchmarks provide hard data to
             | use a Linked List.
        
               | varjag wrote:
               | Tree is a special case of a linked list.
               | 
               | Linked list is definitely sequential (it's a list!), and
               | it can even be sequentially allocated in memory,
               | depending on allocator implementation.
        
               | dpbriggs wrote:
               | That's not necessary true. You can implemented binary
               | trees as an array, so are trees a special case of arrays?
               | 
               | There can be sequential and non-sequential
               | implementations of ADTs.
        
             | saagarjha wrote:
             | B-trees aren't too bad.
        
         | monadic2 wrote:
         | > The most important thing to know about these is that you
         | would almost certainly never do these in a real project: lists
         | are in `std` but even then the vast majority of cases should
         | use `Vec`.
         | 
         | Not relevant to rust, no?
        
           | steveklabnik wrote:
           | That is true of Rust. The linked list is bad though, even
           | irrespective of this argument about the utility of lists.
        
         | grok22 wrote:
         | I am learning Rust due to it's memory safety features and
         | comments like this rub me the wrong way for some reason -- they
         | give me an uneasy feeling that I will run into unexpected
         | limitations in the language because these corners are not
         | exercised enough. Because people who know the language are
         | saying that the kind of data-structures that I deal with day in
         | and out in my day-to-day work are not the "preferred" thing to
         | do.
         | 
         | I do system software/networking software and I deal with a lot
         | of linked lists and trees and these comments make me feel that
         | Rust may not be a good language for my use-case in-spite of me
         | wanting the memory safety features.
        
           | mkesper wrote:
           | It's just not needed to implement trivial data structures by
           | yourself if it's already implemented before. Also linked
           | lists might be sub-optimal, see other comments.
        
           | jdsully wrote:
           | The comment applies word for word to C++ as well. Linked
           | lists are just bad datastructures for modern CPUs.
           | 
           | I don't use rust so I can't speak to the quality of the std
           | lib, but I don't think its fair to use the parent comment as
           | evidence either way.
        
           | pas wrote:
           | Do you currently use C? C++ perhaps?
           | 
           | I mean basically no other language that currently comes to
           | mind fiddles so much with the basics.
           | 
           | In Rust, just as in Java, people write highly optimized safe
           | and fast data structures and others build upon that.
           | 
           | In C/C++ it seems every project reinvents the wheel to a
           | large degree.
           | 
           | Or am I mistaken?
        
             | bobbiechen wrote:
             | >In Rust, just as in Java, people write highly optimized
             | safe and fast data structures and others build upon that.
             | 
             | I'm currently taking a class on API design with Josh Bloch
             | (Java Collections, Effective Java, etc.), who pointed out
             | that this wasn't the case until the late 90s or so - he
             | pulled out an example of a KWIC system [1] described in a
             | paper [2] from 1971:
             | 
             | Parnas 1971: _This is a small system [that] could be
             | produced by a good programmer within a week or two._
             | 
             | And then Josh proceeded to show his implementation of the
             | same system, which he had written in <2 hours just by
             | making use of the built-in collections, regular
             | expressions, sorting, string formatting, etc.
             | 
             | It's really cool that these standard data structure
             | implementations exist and are so accessible, making people
             | orders of magnitude faster than before.
             | 
             | [1] https://en.wikipedia.org/wiki/Key_Word_in_Context [2]
             | https://prl.ccs.neu.edu/img/p-tr-1971.pdf
        
             | grok22 wrote:
             | I use C. And you are right, in my field there is a tendency
             | to a large degree to re-invent the wheel of data-structures
             | :-). Sometimes justified, but most times not. But mostly
             | because C doesn't usually have well-known, industry-
             | standard libraries for common data-structures. Or even if
             | they do, it's kind of trivial to implement basic data-
             | structures (not saying they will be bug-free!) instead of
             | relying on some random distribution of a library from the
             | Internet.
             | 
             | BTW, most times, the problem is not performance, but tight
             | control of memory usage.
        
             | int_19h wrote:
             | It is definitely not the case for C++. It has a decent
             | standard library - when it comes to collections, richer
             | than many other languages, in fact - and then there's Boost
             | for more exotic needs.
        
           | dodobirdlord wrote:
           | Linking my comment from elsewhere.
           | 
           | https://news.ycombinator.com/item?id=22393723
        
           | btilly wrote:
           | Don't worry. The language has lots of support for the style
           | you wish to work in.
           | 
           | As for the specific data structures, I recommend setting
           | aside your knowledge of their utility and consider carefully
           | why they are being criticized. Then run some experiments to
           | convince yourself of what is true. If you come to the
           | conclusion that the criticisms make sense, then you can
           | proceed with the knowledge that the people in question have
           | thought about the issue and came to good decisions. This
           | should inspire hope that they have thought about other issues
           | as well and may have valuable insights.
           | 
           | For the record, I concluded many years ago that some variant
           | on arrays/vectors/whatever outperforms linked lists in most
           | situations by a lot. (There are times when
           | arrays/vectors/whatever can't work. But they are less common
           | than most imagine.) And changes in hardware have made that
           | more true over time. Trees, on the other hand, have a
           | flexibility that lends itself to many problems that it is
           | hard to find other structures for many use cases. But even so
           | if performance is critical, it is worth jumping through a lot
           | of hoops to make sure that all of the pointers that make up a
           | tree are physically close together in memory to improve the
           | efficiency of your caches.
        
       ___________________________________________________________________
       (page generated 2020-02-22 23:00 UTC)