[HN Gopher] Learn Rust with entirely too many linked lists (2019) ___________________________________________________________________ Learn Rust with entirely too many linked lists (2019) Author : pcr910303 Score : 260 points Date : 2020-02-22 12:29 UTC (10 hours ago) (HTM) web link (rust-unofficial.github.io) (TXT) w3m dump (rust-unofficial.github.io) | brundolf wrote: | When I was first getting started with Rust this tutorial was eye- | opening. It gave me a clear view into the somewhat impenetrable | world of working with Boxes (pointers to the heap) directly, in | the context of the borrow-checker. | | It also convinced me that you usually just want to use the | standard library data structures if you can :P | xlap wrote: | Basically anything published about Rust is turning me away from | the language. | | It could be unfair though: Technical writing (and that includes | humorous opinionated pieces) has declined dramatically in the | last 10 years. | | Or perhaps writing as a whole has declined. | Dylan16807 wrote: | It wouldn't hurt to be specific. | zackify wrote: | Can't wait to follow this. I've been going through the official | book these past couple weeks. As someone who's never used systems | languages (mainly a js / node person) I was able to write a | redis-like database quickly on top of TCP. I've also picked up | async / await and how it works in rust. | | There's so much to learn but the compiler is surprisingly | helpful. I loved Typescript and it is like TS on steroids. When I | get stuff compiling I have so much confidence. | SAI_Peregrinus wrote: | Lots of these comments seem confused about the purpose of this. | It's not about using linked lists (that's easy, they're in | std::collections), it's about implementing them. So the common | non-kernel/embedded use cases are well supported, just use | std::collections::LinkedList! But if you're in a no-std context | then you might need this. | caconym_ wrote: | A while ago now, this was the tutorial that got me to the point | where I could implement linked structures in Rust. Highly | recommended. | kybernetikos wrote: | I really value this guide. | | Writing data structures in a language is one of the standard ways | that I convince myself that I understand it. Rust really messes | hard with that mindset. | | This guide talked me through all the things I'm not used to | worrying about in garbage collected languages, showed me the | error messages that I'd been banging my head against, explained | what they meant, and did so with humor. | winrid wrote: | I'll try this at some point. My first couple forays into Rust | made me miss Java and C. | wcrichton wrote: | Another great space of exercises in this vein are in-place | operations on binary trees. | | Specifically, rebalancing [1] proved particularly tricky for my | students. I think it's a good litmus test for whether you | understand ownership, borrowing, and algebraic data types. | | [1] http://cs242.stanford.edu/f19/assignments/assign6/#22-bst- | in... | matthewaveryusa wrote: | >You're doing some awesome lock-free concurrent thing. | | I can confirm lists are pretty rarely used. The four times I've | used a linked list in 10 years of professional programming were: | | -quick&dirty hashmap in C | | -threadsafe queue | | -keeping track of a set of objects that can't be copied ( | threads) | | -LRU cache | | In C++ at least, the nice thing about a list is you can push/pop | in the front/back without a reallocattion or invalidating | iterators. | chongli wrote: | Linked lists are used a lot in game programming where the | worst-case behaviour of std::vector and similar structures is | undesirable. Linked lists may be slow for a variety of reasons | but they're simple and predictable which is very nice when | you're trying not to miss the 16.67ms per-frame window. | SAI_Peregrinus wrote: | Used, yes, butbnot implemented. std::collections::LinkedList | is a doubly-linked list suitable for games. It's basically | just the no-std crowd that might need to implement their own. | estebank wrote: | I thought it was the other way around: games using arrays, | non arbitrarily growable vectors to precisely be able to | iterate over everything quickly in a deterministic fashion. | Specially because a lot of what games have to deal with is | "visit every node". | chongli wrote: | They use arrays for static vertex, texture, etc. data to | send to the GPU. They don't use arrays for things that are | being created and destroyed all the time, such as units in | a strategy game, they use lists for those things. | estebank wrote: | But do they use linked lists for them? I would have | assumed you would use an arena for something like that, | precisely because you already need to iterate over all of | them and can represent any graph as keys into a | generational index, so that reading stale keys is | trivially handled. | | Again, I'm not a game dev and this is just my | understanding after reading up on these topics after | watching https://youtu.be/aKLntZcp27M. | big_chungus wrote: | I've used linked lists a little more because I mostly do C | rather than C++. Sometimes I end up using it as a quick-and- | dirty vector when I don't want to bother writing something or | using a library, though I typically use some kind of | block/arena allocator vs. just malloc. | Animats wrote: | Doubly-linked lists are hard with Rust's ownership system. I've | argued for Rust having backpointers as a built-in type the borrow | checker understands. You have to maintain the invariant that if A | points to B, B points back to A. A is an owning pointer, and B is | a non-owning pointer locked in a relationship with A. Easy to | check if the language lets you say that's what you're doing. Rust | lacks that. | | If you have safe backpointers, most tree-type data structures | with backpointers can be constructed. A nice feature to have. | magicalhippo wrote: | As a non-Ruster, any particular reason they didn't include | this? A lot of the code I've done over the years involve | graphs, which have a lot of circular pointer and/or | backpointers. I find it kinda weird they'd make such a common | thing a PITA in a new language. | eximius wrote: | Rust has a concept of ownership and borrowing. (I think) | everything is fine when you just have immutable references | everywhere. But as soon as you start taking mutable | references (of which only one can exist at a time) or | ownership (stronger than mutable references), then cycles | break things. | | You can get around it with Interior Mutability, it's just | slightly more verbose. | | Also, it isn't really that they made an easy thing hard. | Properly maintaining the invariants with cyclical references | is hard and unsafe. Rust exposes that difficulty in a very | literal way | dodobirdlord wrote: | Rust's ownership rules aren't for the purpose of making the | user's life hard, they are the least restrictive system that | could be devised that allow the compiler to uphold Rust's | safety guarantees. It was not too long ago that it was common | knowledge that a programming language either has a garbage | collector or has manual memory management, or possible both. | Safe Rust has neither, but not without the effect of making | awkward certain kinds of programming that are fundamentally | difficult to make safe. | | The Rust question to ask here is: Do you really need | _pointers_ , specifically? Or would some other reference | mechanism work? You could put all of your graph nodes into a | linear data structure like an array, and have them point to | each other by holding a list of indices instead of a list of | pointers. Or you could give all of your nodes unique keys and | keep them in a table, and have them hold references to each | other by key. The compiler will not try to prove the | correctness of your graph algorithm, and in the event of | programmer error that leads to dangling references your | program will have to handle the scenario of following an | index or a key and not finding a value, so bugs will not | introduce memory unsafety. | | There's also ongoing work on memory arenas in the nightly | compiler, and I believe some libraries. Putting a graph into | an arena is a good way to appease the compiler, because the | entire arena will be freed at the same time, ensuring that | hanging pointers will never exist between graph nodes. | lrem wrote: | In your graph, the nodes are not owning each other, are they? | magicalhippo wrote: | No, but they do point to each other. For example, each node | can have an array of all the edges, and each edge has two | pointers to each node it connects. Via this you can walk | around cycles for example. And typically these can change | when nodes are inserted or removed. | Animats wrote: | This is an enforced invariant. This results in proposals that | it should be done by adding a system for general invariants | and design by contract features. Those are then rejected as | too complicated and something for the far future. | | So, the usual answer is "use unsafe code." What could | possibly go wrong? | moth-fuzz wrote: | Historically, the classic graph structures are adjacency | lists and adjacency matrices. Those are quite easy to | represent in Rust if you use indices instead of pointers. | Node/Pointer type graph structures aren't particularly | efficient anyway, even in C or C++. | habitue wrote: | It might actually be possible now with Pin. Could anyone | actually knowledgeable about this comment? | renox wrote: | I don't know much about Rust (and nearly nothing about Pin) | but if this was true, don't you think someone would already | have written a library providing doubly linked list without | using unsafe? | dpbriggs wrote: | You can somewhat emulate this with Weak pointers: | | https://doc.rust-lang.org/std/rc/struct.Weak.html | | They don't count against ownership, but do bring some extra | headaches of their own (referencing counting overhead, etc). | ubercow13 wrote: | This is great as an intro to Rust, I preferred this as a Rust | starter tutorial to the main rust book or any other tutorial I | tried | Waterluvian wrote: | I'm loving both. The book is comprehensive and teaches you even | the most basic concepts, so it's great for a broader skill | range. | | But I love this one too because it digs into some CS | archaeology and that helps me dig into the theory and history. | I doubt I'll ever have to implement a linked list but knowing | how they work and are implemented and their advantages and | drawbacks is great. | | Tangentially, there's a wonderful game called Human Resource | Machine which teaches linked lists and other assembly-like | programming without you even realising it. | ubercow13 wrote: | The book is very comprehensive but it didn't click for me as | a beginner, I found much of the exposition left me with | unanswered questions while it went on to cover more ground. | I'm not sure if those questions were answered later but I got | lost very quickly. Maybe it's aimed at people with more C++ | background? | | This linked list tutorial answered every question I thought | of almost exactly as I thought of them, which made it a joy | to read. I also liked Rust By Example [1] over the book. | Based on my experience I'd recommend this tutorial and then | implementing something using Rust By Example as reference. | But everyone learns differently! | | [1] https://doc.rust-lang.org/rust-by-example/ | geraldbauer wrote: | A little more light-hearted but the same style to learn Ruby with | Entirely Too Many Fizz Buzzes. See | https://yukimotopress.github.io/fizzbuzz | ajross wrote: | > Mumble mumble kernel embedded something something intrusive. | | And this is why kernel/embedded/something/something developers | don't take Rust as seriously as you want them to. | | You can't simultaneously declare your language the best choice | for system software development _and_ treat the long-evolved | patterns of those paradigms as a joke. | | There are very good reasons for intrusive data structures, not | least of which being their ability to operate in contexts where | no heap is available. If you don't understand them or don't want | to talk about them or want to limit your discussion to situations | with different requirements, then say so. | jasondclinton wrote: | https://github.com/Amanieu/intrusive-rs implements what you're | looking for with no heap. There's no need to flame or | misrepresent what the book says (no one said embedded was a | joke). The Rust embedded community is strong. | ajross wrote: | You're like the sixth person to argue I'm somehow | "misrepresenting" what is being said in this book? I'm | quoting it directly. The "flame" is in the original, and I'm | responding to it. | | Take it out. It's bad. | jasondclinton wrote: | Assuming that you're not trolling, I'll explain how you | misinterpreted what you read: the section that you quoted | is a sub-heading under: | | > An Obligatory Public Service Announcement | | which is an argument that: | | > Linked lists are as niche and vague of a data structure | as a trie. | | The author then goes on to state that many people have | contacted him to argue that linked lists are not niche and | he puts each of those arguments under a heading: | | > Mumble mumble kernel embedded something something | intrusive. | | is one such heading. Inside of it, he continues the | argument that _linked lists for embedded no-heap scenarios | are niche_ and--to continue his argument--we therefore | shouldn 't be teaching undergrads linked lists just like we | don't teach them tries. | robbyt wrote: | To explain what I think this comment means. When working on | embedded systems, you can interact with hardware devices by | writing directly to specially mapped memory areas. | | E.g., if you want to write text to a small screen, the kernel | driver gives you a memory region that you write bytes to, and | they're shown on the screen immediately, without requiring the | CPU. | monocasa wrote: | They're saying more that intrusive data structures are really | nice in situations that restrict heap allocation. The memory | overhead outside of the structures linked together is O(1), | because all of the per object metadata is stored in the | objects themselves. That means consumers generally don't have | to be hardcoded for certain number of objects like you would | for an array/vector that doesn't have access to a heap. | redrobein wrote: | > To explain what I think this comment means. When working on | embedded systems, you can interact with hardware devices by | writing directly to specially mapped memory areas. | | And why can't you do this in rust? | jasondclinton wrote: | You can. Rust allows full memory addressing. | lawn wrote: | There's nothing to prevent you from doing this, although | you may have to wrap it in an unsafe block. | fortran77 wrote: | My feelings, too. I do both embedded programming (in C/C++ and | CUDA) and Erlang programming for a living. Erlang, too, doesn't | let you (easily) write your own linked list structures, etc. | | But for embedded programming with tight memory or performance | constraints these data structures are essential so we use C++ | or even C. They're well understood and the implementations have | simple, elegant solutions. | | For "safety" when we don't need absolute control, we'll choose | a GC language like C# or F#. No need for the complication of | Rust. | saagarjha wrote: | Rust gives you just as much control as C when you need it. | throwaway17_17 wrote: | I may be mistaken, but if using unsafe does not allow for | the 'borrow-checker' to be turned off and allow for code | which doesn't abide by the checker's requirements, then it | clearly does not give "as much control as C". Again, I | might have missed some of the subtleties of circumventing | Rust's static checking, but I don't think I can purposely | create some non-deterministic racy-appearing abstractions, | which would be trivial to do using global variables in C. | nicoburns wrote: | You can't turn off the borrow checker for references, but | Rust also provides raw pointers which are not subject to | borrow checking (these are exactly like pointers in C, | and can be cast to and from references (this is a no-op | at runtime since they share the same memory | representation)). | | https://doc.rust-lang.org/1.30.0/book/first-edition/raw- | poin... | | You can create and manipulate raw pointers in safe code, | but dereferencing them requires an `unsafe` block. | zozbot234 wrote: | AIUI, there are some hardware architectures where even | _creating_ a wild pointer might be undefined behavior, | regardless of whether that pointer is subsequently | dereferenced, and C inherits these requirements. This | means that it might be desirable to restrict creation and | manipulation of raw pointers to unsafe code in Rust as | well, if this can be done without introducing undue | incompatibilities. | | (Rust editions would naturally allow for this: Rust 2021 | would warn on creating/manipulating raw pointers in Safe | Rust, and stop warning for "unnecessary" use of unsafe | deriving from these operations; Rust 2024 would make | these a hard error ouside `unsafe`.) | aw1621107 wrote: | > AIUI, there are some hardware architectures where even | creating a wild pointer might be undefined behavior, | regardless of whether that pointer is subsequently | dereferenced | | Would you be able to point me to some references for such | hardware? Im not sure how that would work (at least based | on my admittedly limited amount of experience). Wouldn't | a pointer look like any other integer right up until it's | used as a memory operand? Or would said architecture have | some way to distinguish pointers and a "regular" integer | in registers? | Hello71 wrote: | Allegedly, some platforms have pointer trap | representations, where a certain pointer can be created, | but may not be used in any operations of that type. No | modern systems have such trap representations for pointer | types, but the C standard inherits their legacy, and, | more importantly, C compilers use it as justification for | certain types of optimizations. Since it's not a hardware | limitation, Rust can perfectly well take the opposite | path and say that the compiler may not use it for those | optimizations. | | https://stackoverflow.com/questions/6725809/trap- | representat... | zozbot234 wrote: | > No modern systems have such trap representations for | pointer types | | This may be incidentally true, but "address | sanitizer"-like features are becoming more common on | modern hardware, and while these do not currently trap on | _creation_ / _manipulation_ of a 'wild' pointer (since, | strictly speaking, a trap only happens on dereferencing), | there's no solid reason to expect this to remain the case | in the future. | int_19h wrote: | I don't see how you could trap creation or manipulation, | since those pointers are stored in registers and/or | memory, and both are fundamentally untyped. How would the | hardware even know that something is a pointer, on any | architecture that is popular today? | Dylan16807 wrote: | Maybe. Or maybe you'd just change it so 'wild pointers' | created in safe code are stored as ints on that | architecture. | fortran77 wrote: | It's sad that we have to create throwaway accounts if we | want to be able to simply say we prefer or use other | languages over Rust without getting downvoted so much | that we'll end up shadowbanned. | [deleted] | saagarjha wrote: | > if using unsafe does not allow for the 'borrow-checker' | to be turned off and allow for code which doesn't abide | by the checker's requirements | | It does, that's why it's there. | steveklabnik wrote: | Unsafe doesn't turn off the borrow checker, but it does | give you access to pointers that aren't checked. | zozbot234 wrote: | > if using unsafe does not allow for the 'borrow-checker' | to be turned off and allow for code which doesn't abide | by the checker's requirements | | It allows the latter. 'Code that doesn't abide by the | checker's requirements' uses separate facilities that are | only allowed in unsafe code. This means that `unsafe` | doesn't _have_ to turn off anything, and further | pinpoints the parts of the code where caution is needed | in order to maintain the invariants that Safe Rust is | based on. | terminaljunkid wrote: | In before someone from rust evangelism strike force chimes in | saying rust is actually very simple and borrow checker takes | 2 days to get familiar. | [deleted] | steveklabnik wrote: | It explicitly says that there are good use cases, just that | they're very rare. | ajross wrote: | It does not. Maybe somewhere else it does. That section, | verbatim, says: | | _It 's niche. You're talking about a situation where you're | not even using your language's runtime. Is that not a red | flag that you're doing something strange? | | It's also wildly unsafe. | | But sure. Build your awesome zero-allocation lists on the | stack._ | | And this is (1) wildly mistating the requirements and (2) | deeply offensive to those of us who work in those regimes. | | Obviously, yes, there's room for a reasoned argument about | this stuff and whether alternative paradigms can be deployed | in heapless contexts. This isn't it. | MaulingMonkey wrote: | > And this is (1) wildly mistating the requirements and (2) | deeply offensive to those of us who work in those regimes. | | Care to expand how? While not a kernel dev, I've had my | share of use cases where I've done exactly this kind of | thing in gamedev, and I simply cannot bring myself to | disagree with what's been said, or see what's offensive. | | It's strange, debugging when the pointers get corrupted by | other code exhibiting UB is painful, it's a potential | multithreading hazard, and flat contiguous arrays are | frequently more appropriate - but it's sometimes useful. | It's not _arguing_ that an alternative paradigm can - or | even should - be deployed in a heapless context. It 's | explicitly admitting that intrusive linked lists are an | appropriate paradigm. | steveklabnik wrote: | > It's a fine data structure with several great use cases, | but those use cases are exceptional, not common. | catatattat wrote: | > (2) deeply offensive to those of us who work in those | regimes | | You know how sometimes someone who is A Little Too Online | gets offended because they think you said a Bad Thing, but | it was actually just a typo or a straight misreading, and | you try to explain that they're reacting to something you | didn't even say let alone believe, but because they have | already decided you are the Bad Person they interpret that | as you "doubling down" on the bad opinion that they | attributed to you, so they get even angrier and even more | convinced that you sincerely believe the Bad Thing? | | I mean no judgment. We all have such sensitivities. But | maybe now you see how easily you can wind up on the wrong | side of a public debate by searching for offense where | there is in fact none. | roblabla wrote: | The actual context of your quote: | | > Just so we're totally 100% clear: I hate linked lists. With a | passion. Linked lists are terrible data structures. Now of | course there's several great use cases for a linked list: > > - | You're writing a kernel/embedded thing and want to use an | intrusive list. | | So I've got no clue what you're railing about. The project | specifically acknowledges that there is a need in kerneldev for | those data structures. | | I'm a kernel dev using Rust for my kernel. I use both intrusive | linked list and growable vectors in it. The thing is, the | sentiment expressed in the article really resonates in me: in | most cases, growable vectors are a better choice, performance- | wise. | throwaway17_17 wrote: | The quote that the GP is talking about is included below, | which copy/pasted from the project page, and the Mumble | mumble line is the heading for a paragraph: | | '''Mumble mumble kernel embedded something something | intrusive. | | It's niche. You're talking about a situation where you're not | even using your language's runtime. Is that not a red flag | that you're doing something strange? | | It's also wildly unsafe.''' | | Also, prior to this author claims the following, where the | first line also a section heading: | | ''' I can't afford amortization | | You've already entered a pretty niche space''' | | These fiat rulings based on one an authors generalization of | what is 'niche' are what I assume GP was commenting on. These | are the kinds of dismissals that some developers take issue | with, as GP states. | saagarjha wrote: | Kernel development _is_ niche. Most of the time you don't | need linked lists. | varjag wrote: | When all you have is Rust everything starts to look like | adjustable array. | fluffy87 wrote: | Rust has singly linked, double linked ( in its minimal | std library) and intrusive linked lists - all with APIs | that guarantee a lack of memory errors and data races at | compiler. I don't know any good kernel developer that | wouldn't like those guarantees and I work in the Linux | kernel professionally full time. | saagarjha wrote: | What's an adjustable array? A vector? | varjag wrote: | A vector is a one-dimensional array. Adjustable arrays | can have arbitrary number of dimensions, although am not | sure if it's a thing in Rust. | staticassertion wrote: | That just sounds like a graph. | varjag wrote: | How in the heaven an array can sound like a graph? | staticassertion wrote: | Sounds like one to me. | dpbriggs wrote: | I've spent years programming rust and I'm not sure what | you mean. Do you mean arrays of tuples, or nested arrays? | varjag wrote: | I mean a multi-dimensional array. | dpbriggs wrote: | Thanks! Why does everything look like a multi-dimensional | array for you in rust? | varjag wrote: | An array (an ADT) can be arbitrary dimensional, with one- | dimensional case often called vector (and two-dimensional | called matrix). An array can also be adjustable, both in | one and multi-dimension variants. | | So a vector can be both adjustable and non adjustable, | and some language do have both versions. Some language | have adjustable one-dimensional array/vector as the only | dynamic aggregate/ordered datatype. | | And to answer the post I replied to originally, a vector | in rust is a one-dimensional adjustable array. To answer | your question above, no that's not what I meant, sorry | for not being clear enough from the beginning. | dpbriggs wrote: | Thanks for clarifying. | | I think we're just using different definitions. The only | real definition of an array I've encountered in work and | school is that it's just a one dimensional collection of | elements, usually of static size. A vector depending on | context is usually the same thing as an array but you | conveniently change the size dynamically. Lists can | whatever you need it to be given the context, just needs | to be sequential. | | Of course you can represent higher dimension structures | by linearizing indices (x + row_size * y, etc). | | I think people are getting confused as most don't | consider arrays to be arbitrarily dimensional without | some scheme. | | Completely off topic but you've reminded me of this great | article: https://hypirion.com/musings/understanding- | persistent-vector... | imtringued wrote: | If you can't tell when to break or not break rules then | maybe you are not a kernel developer? It's almost like a | King only following his own laws instead of being the one | making them. Why are you a King again? | ajross wrote: | (Had to edit in a quote to what I was replying to because you | fixed the original): | | > The actual context of your fake quote: | | Good grief, that was a verbatim quote! Here's a link to the | exact text I quoted: | | https://rust-unofficial.github.io/too-many-lists/#mumble- | mum... | [deleted] | jszymborski wrote: | I think it must be stressed that this is an unofficial guide | and not "the voice of the project", etc... | jasondclinton wrote: | I went through this a few years ago for fun. It's a great guided | tour of the compiler errors when working with complex memory | safety needs. The most important thing to know about these is | that you would almost certainly never do these in a real project: | lists are in `std` but even then the vast majority of cases | should use `Vec`. | Dylan16807 wrote: | Linked lists are inherently niche on modern hardware. | senderista wrote: | Yes, along with binary trees, but there are cache-friendly | alternatives like unrolled linked lists and B-trees. | Animats wrote: | Yes. Lists were fine when memory was random-access. Today, if | you're linking all over memory, cache misses dominate | performance. So contiguous data structures are preferred. | chapium wrote: | This isn't a linked list issue at all, but a problem with | constructing the nodes. If you construct list nodes in a | contiguous location cache misses are a non issue. | estebank wrote: | That's no longer a linked list, it's a vector. | [deleted] | mikepurvis wrote: | I think the argument is that if the nodes are fixed size | then you can preallocate a big block of them slab-style, | and for the pointers you just use indexes into that array | rather than "real" pointers. | | Potentially there's a small memory savings as well if you | can use 2 or 4 bytes for the index instead of a full 8 | byte pointer. But you're taking on a lot of complexity | and tuning by going this route so you'd really have to | test and make sure the gains were worth it. | mikepurvis wrote: | But that's fundamentally a function of the usage pattern. | If the usage is "construct a bunch of nodes" and then | nothing, then you should be using a vector-type structure | anyway. | | The promise of a linked list is being able to iterate and | make constant-time insertions/deletions wherever you | want. But the more such mutations occur, the more | fragmented your memory will end up, and the more you'll | get hit by the cache issues--especially if your list is | large, exactly where the purported benefits of a linked | list are strongest. | varjag wrote: | Seriously? Are trees also niche? | Dylan16807 wrote: | What kind of tree? A binary tree isn't great for many uses, | and doesn't excel at much. But a nice fat B+ tree removes | the vast majority of pointer-chasing latency. | kybernetikos wrote: | I've got quite keen on B+ trees. It seems like the world | where we had fast main memory and slow drive storage is | not very different to the world where we have fast cache | memory and slow main memory. | Tuna-Fish wrote: | Yes. For the vast majority of uses, you should probably use | a hash table instead. | | There are still niche uses for both linked lists and trees, | but the sad fact is that the relative time it takes to | chase a random pointer compared to doing _literally | anything else_ keeps getting worse, and is already at the | point where frequently linearly copying kilobytes of arrays | is almost always faster than using a linked list. | int_19h wrote: | A hash table is vulnerable to collision attacks, if keys | are derived from any kind of untrusted external input. | Trees might be slow, but they're _consistently_ slow. | deckard1 wrote: | non sequitur. No one mentioned security, nor do any | elementary data structures concern themselves with such | higher level things. If you're allowing unlimited | untrusted data to control internal data structures, | you're going to have a bad time regardless of which data | structure you're using. | eximius wrote: | And rusts default hashing implementation is hardened | against these attacks | btilly wrote: | _...time it takes to chase a random pointer compared to | doing literally anything else..._ | | When I have had to do serious stuff with linked lists, | trees, and so on, I found that it was much, much faster | to assign an array of nodes, and allocate the linked list | out of that close together. There was a lot of complexity | in doing so but colocating data to match my access | pattern was a big win. | varjag wrote: | If your data/algorithm calls for a certain dynamic | structure, let's say a red-black tree, replacing it with | array is pointless. | | Last time I looked, kobjects in Linux kernel were still | dynamically linked in a bunch of ways, and tree-like data | structures are widely used. | staticassertion wrote: | I think it's probably fair to call kernel development | niche? And the fact that Linux uses linked lists a lot is | not necessarily a strong case for linked lists, but just | a matter of very specific constraints - like perhaps not | wanting to optimize too much for underlying hardware? | varjag wrote: | I used Linux kernel as an example of easily accessible, | extremely well known, performance tuned open source | project. Of course dynamic data structures are used in | countless other applications, like DBMS, web servers and | web browsers. Really in nearly any non-trivial codebase. | adrianN wrote: | Sure, but algorithms that explicitly need a linked | structure for performance are very few compared to | algorithms that have the same or better asymptotic with a | hash and a vector or for which the input sizes are such | that the large constants pointer chasing causes don't | make up for worse asymptotics. | varjag wrote: | If the algorithm calls for a balancing tree, you'd end up | re-implementing it across of bunch of (linked) hash | tables or over an array treated like a heap. Explicitly, | or if one does not know what they are doing, implicitly. | In either case you would have no performance advantage. | btilly wrote: | If you know what you are doing, you may well realize a | performance advantage. | | For example LevelDB (which is conceptually based on | Google BigTable's design) looks a lot like a balancing | tree in its access patterns, but is a lot faster than a | variety of alternatives. And it is faster in part because | sequential data is stored sequentially in order instead | of using a data structure that results in random access | patterns. | | And no. It doesn't use linked lists. | yazaddaruvala wrote: | Trees and Lists are types of data structures. Both are very | useful in all contexts. | | Linked List is a special kind of List. It typically implies | a non-sequential data layout. | | With all of the modern layers of abstraction, non- | sequential memory access typically means poor cache | friendliness, i.e. poor performance. | | Hopefully that helps explain why Linked Lists are | considered niche, I.e. specific to embedded programming or | in very special cases when benchmarks provide hard data to | use a Linked List. | varjag wrote: | Tree is a special case of a linked list. | | Linked list is definitely sequential (it's a list!), and | it can even be sequentially allocated in memory, | depending on allocator implementation. | dpbriggs wrote: | That's not necessary true. You can implemented binary | trees as an array, so are trees a special case of arrays? | | There can be sequential and non-sequential | implementations of ADTs. | saagarjha wrote: | B-trees aren't too bad. | monadic2 wrote: | > The most important thing to know about these is that you | would almost certainly never do these in a real project: lists | are in `std` but even then the vast majority of cases should | use `Vec`. | | Not relevant to rust, no? | steveklabnik wrote: | That is true of Rust. The linked list is bad though, even | irrespective of this argument about the utility of lists. | grok22 wrote: | I am learning Rust due to it's memory safety features and | comments like this rub me the wrong way for some reason -- they | give me an uneasy feeling that I will run into unexpected | limitations in the language because these corners are not | exercised enough. Because people who know the language are | saying that the kind of data-structures that I deal with day in | and out in my day-to-day work are not the "preferred" thing to | do. | | I do system software/networking software and I deal with a lot | of linked lists and trees and these comments make me feel that | Rust may not be a good language for my use-case in-spite of me | wanting the memory safety features. | mkesper wrote: | It's just not needed to implement trivial data structures by | yourself if it's already implemented before. Also linked | lists might be sub-optimal, see other comments. | jdsully wrote: | The comment applies word for word to C++ as well. Linked | lists are just bad datastructures for modern CPUs. | | I don't use rust so I can't speak to the quality of the std | lib, but I don't think its fair to use the parent comment as | evidence either way. | pas wrote: | Do you currently use C? C++ perhaps? | | I mean basically no other language that currently comes to | mind fiddles so much with the basics. | | In Rust, just as in Java, people write highly optimized safe | and fast data structures and others build upon that. | | In C/C++ it seems every project reinvents the wheel to a | large degree. | | Or am I mistaken? | bobbiechen wrote: | >In Rust, just as in Java, people write highly optimized | safe and fast data structures and others build upon that. | | I'm currently taking a class on API design with Josh Bloch | (Java Collections, Effective Java, etc.), who pointed out | that this wasn't the case until the late 90s or so - he | pulled out an example of a KWIC system [1] described in a | paper [2] from 1971: | | Parnas 1971: _This is a small system [that] could be | produced by a good programmer within a week or two._ | | And then Josh proceeded to show his implementation of the | same system, which he had written in <2 hours just by | making use of the built-in collections, regular | expressions, sorting, string formatting, etc. | | It's really cool that these standard data structure | implementations exist and are so accessible, making people | orders of magnitude faster than before. | | [1] https://en.wikipedia.org/wiki/Key_Word_in_Context [2] | https://prl.ccs.neu.edu/img/p-tr-1971.pdf | grok22 wrote: | I use C. And you are right, in my field there is a tendency | to a large degree to re-invent the wheel of data-structures | :-). Sometimes justified, but most times not. But mostly | because C doesn't usually have well-known, industry- | standard libraries for common data-structures. Or even if | they do, it's kind of trivial to implement basic data- | structures (not saying they will be bug-free!) instead of | relying on some random distribution of a library from the | Internet. | | BTW, most times, the problem is not performance, but tight | control of memory usage. | int_19h wrote: | It is definitely not the case for C++. It has a decent | standard library - when it comes to collections, richer | than many other languages, in fact - and then there's Boost | for more exotic needs. | dodobirdlord wrote: | Linking my comment from elsewhere. | | https://news.ycombinator.com/item?id=22393723 | btilly wrote: | Don't worry. The language has lots of support for the style | you wish to work in. | | As for the specific data structures, I recommend setting | aside your knowledge of their utility and consider carefully | why they are being criticized. Then run some experiments to | convince yourself of what is true. If you come to the | conclusion that the criticisms make sense, then you can | proceed with the knowledge that the people in question have | thought about the issue and came to good decisions. This | should inspire hope that they have thought about other issues | as well and may have valuable insights. | | For the record, I concluded many years ago that some variant | on arrays/vectors/whatever outperforms linked lists in most | situations by a lot. (There are times when | arrays/vectors/whatever can't work. But they are less common | than most imagine.) And changes in hardware have made that | more true over time. Trees, on the other hand, have a | flexibility that lends itself to many problems that it is | hard to find other structures for many use cases. But even so | if performance is critical, it is worth jumping through a lot | of hoops to make sure that all of the pointers that make up a | tree are physically close together in memory to improve the | efficiency of your caches. ___________________________________________________________________ (page generated 2020-02-22 23:00 UTC)