[HN Gopher] Purely Functional Data Structures (1996) [pdf] ___________________________________________________________________ Purely Functional Data Structures (1996) [pdf] Author : debanjan16 Score : 300 points Date : 2023-05-30 11:43 UTC (11 hours ago) (HTM) web link (www.cs.cmu.edu) (TXT) w3m dump (www.cs.cmu.edu) | dang wrote: | Related. Others? | | _Purely Functional Data Structures in Elm - course lecture notes | (2015)_ - https://news.ycombinator.com/item?id=12145741 - July | 2016 (15 comments) | | _What 's new in purely functional data structures since Okasaki? | (2010)_ - https://news.ycombinator.com/item?id=11056704 - Feb | 2016 (42 comments) | | _Purely Functional Data Structures (1996) [pdf]_ - | https://news.ycombinator.com/item?id=10486481 - Nov 2015 (13 | comments) | | _Okasaki: Purely Functional Data Structures (1996) [pdf]_ - | https://news.ycombinator.com/item?id=8327838 - Sept 2014 (1 | comment) | | _What 's new in purely functional data structures since Okasaki? | (2010)_ - https://news.ycombinator.com/item?id=7081191 - Jan 2014 | (17 comments) | | _Ten Years of Purely Functional Data Structures (2008)_ - | https://news.ycombinator.com/item?id=5701396 - May 2013 (24 | comments) | | _What 's new in purely functional data structures since | Okasaki?_ - https://news.ycombinator.com/item?id=1983461 - Dec | 2010 (2 comments) | | _What 's new in purely functional data structures since Okasaki_ | - https://news.ycombinator.com/item?id=1713594 - Sept 2010 (1 | comment) | | _" Purely Functional Data Structures" by Chris Okasaki [pdf]_ - | https://news.ycombinator.com/item?id=1138979 - Feb 2010 (12 | comments) | | _Teaching, Playing, and Programming: Ten Years of Purely | Functional Data Structures_ - | https://news.ycombinator.com/item?id=112270 - Feb 2008 (2 | comments) | | _Chris Okasaki 's PhD thesis on purely functional data | structures (pdf)_ - https://news.ycombinator.com/item?id=8221 - | April 2007 (1 comment) | okennedy wrote: | This is the 
canonical guide to reasoning about amortized | runtimes. The exponentially growing ArrayBuffer (O(1) amortized | append) is the classical data structure used to teach amortized | runtimes, but the O(1) amortized functional queue Okasaki | presents here gives a much better intuition for what amortized | runtimes are all about. | odipar wrote: | Okasaki got me interested in confluently persistent data | structures, way back in the 2000s. | | They seem magical! To be able to combine data from the past with | current data, efficiently! | | They are almost always trees, with the exception of skip-lists, | with all operations O(log(n)). | | After creating my own programming language Enchilada, which is | based on immutable data structures, I started considering what I | deemed "next level": | | _Uniquely represented confluently persistent data structures_ | | Combined with a Merkle tree encoding of such uniquely represented | data structures (they are almost always trees), you can | efficiently and incrementally authenticate them. Think 'blockchain' | on steroids, with incremental cryptographic hashes. Or | torrents, if you are into that kind of thing. | weeksie wrote: | This book is near and dear to my heart. Back in the mists of time | (~05?) when I was learning Haskell I reimplemented several of | these in order to get my head around things. Lots of fond | memories. | 2-718-281-828 wrote: | is this being upvoted onto the homepage based on upvoters | actually understanding that this paper from 1996 is of | contemporary relevance and interest, or more due to keywords like | "pure", "functional", "data" and "structure"? | paulgb wrote: | I can't speak for everyone, but I upvoted it from nostalgia, | having read the book version over a decade ago. I happened to | be thinking about ordering a copy for the office just | yesterday. | dllthomas wrote: | I have a copy on my desk.
The bit about designing data | structures by analogy to number systems (and limiting carry | propagation) is really fun. | bradrn wrote: | > is this being upvoted onto the homepage based on upvoters | actually understanding that this paper from 1996 is of | contemporary relevance and interest ...? | | In my case, yes. | jstrieb wrote: | I've recently used a number of structures that I learned from | this book. Though I don't know if the text represents the state | of the art in purely functional data structures, it's a pretty | seminal work in the area. | UncleMeat wrote: | Nowhere near the state of the art. Lots of improvements since | this was published. | | It is a good book for learning. It is a decent book for | reference. When you want to really fly you will want to reach | for more recent work. | hardlianotion wrote: | Examples of more recent work for us non-specialists? | nequo wrote: | See the commenter here: | | https://news.ycombinator.com/item?id=36125110 | hardlianotion wrote: | Missed it - thanks | bluepod4 wrote: | Your comment is giving early 2010s hipster "you probably never | heard of it" vibes. | ufo wrote: | It is still the go-to textbook for immutable data | structures. Worth the read. | turtleyacht wrote: | The book is on Amazon, but this submission had those keywords, | _plus_ it's a PDF. | | Of course it is of contemporary relevance; functional | programming (FP) is all the rage. | | The tricky bit is questions like: | | How does this jibe with existing JS functional constructs like | "fantasy land," for example? | | How to "recruit" more folks to FP, or even a hybrid approach of | objects interacting functionally? | | Game jams using more FP-like data structures? Or more HN | submissions like that. | | The harder things to evaluate are a lot of other topics, news- | like but investigative and curious, or sites that are | essentially selling a service (versus teaching the mechanism | behind it).
| | For SaaS stuff, since HN is about startups, I have to let it | slide. But the _hacking_ piece is when one person accomplishes | something with persistent decomposition of sequential problems, | or does something clever using tools or ideas from a different | context. | solomatov wrote: | My understanding is that the book is based on this PhD | dissertation. | turtleyacht wrote: | Oh.. then, yes--this was unabashedly a (positively) | triggered reaction (fortunately or unfortunately). | schaefer wrote: | This comment reads as if there is a clear, contemporary | successor for learning pure functional data structures. | | Is there? If so, please do share a reference. | eointierney wrote: | I remember encountering an early draft when he was still doing | his masters? Absolutely cracking paper, one for the ages. Thanks | Chris :) | lemper wrote: | this literature was my "serious" introduction to fp. implemented | some of the data structures in f#. | mklauber1 wrote: | Definitely read that headline as "Purely Fictional Data | Structures". My disappointment is immense. | 082349872349872 wrote: | Purely Fictional Data Structures include: _To Queue a | Mockingbird_, _The Call-stack of the Wild_, _Tesselation of the | d'Urbervilles_, _One Flew Over the Cuckoo Hash_, _The Data of | the Woosters_, _Brideshead Re-visitor_, _The Catcher in the | Trie_, _Les Miser-tables_, and _The Nat-elim of Monte Cristo_. | rntz wrote: | title typo: "Purely Functional Data Structure" -> "Purely | Functional Data Structures" (pluralization) | | reads a bit weird otherwise - sounds like it's discussing a | particular purely functional data structure when it's actually a | survey of many (pretty much the canonical survey of them, in | fact). | debanjan16 wrote: | Thanks for pointing it out. I totally missed it. Sorry.
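okennedy's point near the top of the thread, that Okasaki's O(1) amortized functional queue builds better intuition than the growing array, is concrete enough to sketch. Below is a minimal batched queue in TypeScript (type and function names are mine, not from the thesis): enqueue conses onto a rear list, dequeue reads from a front list, and the rear is reversed into the front only when the front runs out, so each element is reversed at most once.

```typescript
// Immutable cons list: O(1) prepend, structural sharing by default.
type List<T> = null | { readonly head: T; readonly tail: List<T> };

const cons = <T>(head: T, tail: List<T>): List<T> => ({ head, tail });

function reverseList<T>(xs: List<T>): List<T> {
  let acc: List<T> = null;
  for (let p = xs; p !== null; p = p.tail) acc = cons(p.head, acc);
  return acc;
}

// Batched queue: dequeue from `front`, enqueue onto `rear`.
// Invariant: `front` is null only when the whole queue is empty.
interface Queue<T> { readonly front: List<T>; readonly rear: List<T>; }

const emptyQueue = <T>(): Queue<T> => ({ front: null, rear: null });

// Restore the invariant. The occasional O(n) reverse is what makes the
// operations O(1) *amortized*: each element is reversed at most once.
const check = <T>(front: List<T>, rear: List<T>): Queue<T> =>
  front === null ? { front: reverseList(rear), rear: null }
                 : { front, rear };

const enqueue = <T>(q: Queue<T>, x: T): Queue<T> =>
  check(q.front, cons(x, q.rear));

// Returns [head, rest] or null if the queue is empty.
const dequeue = <T>(q: Queue<T>): [T, Queue<T>] | null =>
  q.front === null ? null : [q.front.head, check(q.front.tail, q.rear)];
```

Since every operation returns a new queue and never mutates the old one, earlier versions stay valid; Okasaki's lazy variants go further and keep the amortized bounds even when old versions are reused heavily.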
| tmtvl wrote: | It's weird that there are so many claims in here that the data | structures and algorithms are perfectly performant yet there | isn't even one look at generated assembly or any acknowledgement | of the underlying system that is supposed to _run_ the code. | Proving things are Big O performant is neat, but at some point | the code has to hit hardware. | shepherdjerred wrote: | > at some point the code has to hit hardware | | Yes, but that's not a concern of a computer scientist. | Implementation and execution of the algorithm are up to the | reader. | | It's like complaining that engineers don't do enough novel | computer science research; of course they don't! It's not their | job. | Gravityloss wrote: | Which profession is expected to make progress on actual | software performance? | shepherdjerred wrote: | Computer scientists (and of course engineers) both care | about real-world performance, but some computer scientists | just care about the theory. | fjeifisjf wrote: | That's a very naive view of computer science. That is the | attitude of people who have given up and decided that | computers are too big for science now. | shepherdjerred wrote: | You're right, there are plenty of papers focusing on real- | world performance. I chose not to capture the nuance | because I wasn't sure how to express it succinctly. | scott_s wrote: | How I see it: computer science as an academic field is | roughly split between _theory_ and _systems_. The theory | folks tend to evaluate progress through proofs. The | systems folks tend to evaluate progress through | experiments. | rgbgraph wrote: | I've vouched for this, because it brings up an interesting | point (albeit one that has proven provocative and has been | expressed before). Also, many of your comments are dead. | | Big O/algorithmic complexity is interesting in that it | tends to abstract away the architecture of the underlying | processor. A copy is a copy is a copy. Arithmetic is | arithmetic is arithmetic.
Regardless of what | instructions/processing units the underlying processor has. | Regardless of data access. Regardless of parallelization. | Regardless of memory complexity. All we use are brutish | units of "work" -- despite not all "workers" being equal. | | It reminds me a bit of scientific management: treating your | "workers" as interchangeable units that carry out what has | been decided to be the most efficient movements and | processes -- completely disregarding the individual | characteristics of the worker. For a humanist example, | consider the individual quirks of the physiology (not only | in size, length, and composition of the skeletal system via | the muscles, bones, tendons and so on that would | necessitate there being specific movements and workloads | that best suit them -- but also in psychology; the brain as | its own work-producing part that is uniquely suited and | most efficient for specific workloads; rather than an all- | encompassing abstract average "best workstyle" that makes | no note of these characteristics, but simply decides to use | external metrics like "time to pick up a box using X, Y, or | Z form.") | | The same parallel can be drawn to computers: different | processors and collections of hardware (what is essentially | the "body and brain") have different quirks and different | workloads they perform best at. For example, GPUs are much | more useful in the case of vectorized/parallel workloads | where the operations are relatively simple (f.e matrix | arithmetic). While you can run an iterative matrix | multiplication algorithm on a CPU in n^3 time -- your data | access costs will be significant. Instead, running a | parallel algorithm on a GPU, with RAM very close by, you | can achieve log^2 n. | | This is where CS research really shines: not running away | from the realities of physical systems, but embracing them. 
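On a more concrete note, dllthomas's remark upthread about designing data structures by analogy to number systems can be sketched with a binary random-access list: the structure keeps one complete tree per 1-digit of the size written in binary, and consing an element is literally a binary increment, with tree-linking playing the role of carry propagation. A minimal TypeScript sketch (names are illustrative; Okasaki's skew-binary variant limits the carries further):

```typescript
// Complete binary leaf trees; one tree per 1-digit of the size in binary.
type Tree<T> =
  | { readonly kind: "leaf"; readonly value: T }
  | { readonly kind: "node"; readonly size: number;
      readonly left: Tree<T>; readonly right: Tree<T> };

const treeSize = <T>(t: Tree<T>): number => (t.kind === "leaf" ? 1 : t.size);

const link = <T>(l: Tree<T>, r: Tree<T>): Tree<T> =>
  ({ kind: "node", size: treeSize(l) + treeSize(r), left: l, right: r });

// Digits, least significant first: null is a 0-digit, a tree is a 1-digit.
type RList<T> = ReadonlyArray<Tree<T> | null>;

// cons is binary increment: carry (link trees) past every 1-digit
// until a 0-digit absorbs the carry.
function consR<T>(x: T, xs: RList<T>): RList<T> {
  let carry: Tree<T> = { kind: "leaf", value: x };
  const out: Array<Tree<T> | null> = [];
  for (let i = 0; i < xs.length; i++) {
    const d = xs[i];
    if (d === null) return [...out, carry, ...xs.slice(i + 1)];
    out.push(null);
    carry = link(carry, d);
  }
  return [...out, carry];
}

function lookupTree<T>(t: Tree<T>, i: number): T {
  if (t.kind === "leaf") return t.value;
  const half = t.size / 2;
  return i < half ? lookupTree(t.left, i) : lookupTree(t.right, i - half);
}

// O(log n) random access: find the digit holding index i, then descend.
function lookup<T>(xs: RList<T>, i: number): T | undefined {
  for (const d of xs) {
    if (d === null) continue;
    const n = treeSize(d);
    if (i < n) return lookupTree(d, i);
    i -= n;
  }
  return undefined;
}
```

Index 0 is the most recently consed element, as in a cons list, but lookup now costs O(log n) rather than O(n).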
| jove_ wrote: | Also, while I haven't read any further yet, based on the table | on page 3, their big O performance is pretty bad. | gnulinux wrote: | > It's weird | | It's not weird; this is standard in academic computer science. | It would be weird to do _otherwise_. In a theoretical | dissertation/paper like this you can't just randomly bring up | compiled assembly; it's _completely_ and _utterly_ off topic, | no more on topic than bringing up whether the code was run by | an interpreter, or on the JVM, or transpiled to Haskell, or run | on a GPU, etc. | nerdponx wrote: | I can't speak for other fields, but in statistics and machine | learning, it's not uncommon to see at least some practical | experiments or simulations alongside theoretical results, or | at least some discussion of practical considerations. | | There are definitely plenty of pure theory papers as well, | and pure theory is still a contribution. However one would at | least hope to see subsequent work that does focus on the | practical aspect. | amelius wrote: | "Computer science is no more about computers than astronomy | is about telescopes." | | -- Edsger W. Dijkstra | | However, I do believe that astronomers put references to the | actual instruments they used in their publications. | wk_end wrote: | Other commenters have addressed the key point - it's _not_ | weird. CLRS doesn't look at generated assembly either. | | One other thing I want to add, though, is that this book was | published in 1996. The CPU landscape at the time was far more | diverse, both in terms of ISA (which assembly?) and also in | terms of capabilities. Even on the x86 line, plenty of people | were still using 486s, which weren't even superscalar, and thus | had pretty strikingly different performance characteristics | than, say, a Pentium Pro (a "modern", OoO CPU).
Tying your | algorithmic analysis to a particular piece of hardware would be | pretty useless; providing a satisfactory look at the | performance on so many different (micro-)architectures could | risk overwhelming the content. | | Moreover, at the time, performance was, maybe, a little more | straightforward, making your concerns perhaps not quite as | important. Memory was slower than the CPU but not the way it is | now. Caches were much smaller and less useful. Branch | prediction was more primitive. CPUs had relatively weak | reordering capabilities, if any at all, and simpler pipelines. | There were few dedicated SIMD instructions. This was a world | where linked lists still often made sense, because the hardware | penalties you paid for them were smaller. In other words: in | 1996 the on-paper performance wasn't at quite as big a risk of | diverging quite so wildly from the real-world performance, so | keeping things simple and focusing on the on-paper performance | was less objectionable. | Tainnor wrote: | Yeah, Big-O notation is only part of the picture and yes, | "galactic algorithms" that are theoretically efficient but only | for astronomically big inputs do exist, but in _many_ cases | (especially if your algorithm isn 't doing anything | particularly deep or clever) linear is faster than quadratic | which is faster than exponential on real world input, too. | | Mathematics is the science of modeling and abstraction and that | abstraction exists in order to make the process of knowledge | gathering easier. It works very well for the sciences, so I | would say the method has been proven. Of course, if the models | turn out to be completely inappropriate, that would be | different, but so far, the simple machine models used | (implicitly or explicitly) in the analysis of algorithms seem | to be rather reasonable. 
| | The alternative that you suggest, trying to benchmark _concrete | implementations_ of algorithms, has so many confounders that it | would be very hard to execute properly. I'm sure people are | doing this too, but the formal Big O proof has a different | quality to it because it does not depend on those particulars. | agumonkey wrote: | I'd even say that maths is what appears when you stop being | willing to pay for concretion. | malkia wrote: | For C++ check this one out - https://github.com/arximboldi/immer | and this talk from the author - | https://www.youtube.com/watch?v=_oBx_NbLghY (CppCon 2018: Juan | Pedro Bolivar Puente "The Most Valuable Values") | tpoacher wrote: | I have a personal pet peeve about the misuse of terminology when | dealing with such names, for which the only solution is to go | read the original reference to figure out what they meant by it. | | E.g., in this case, to describe a data structure as "purely | functional" makes zero sense to me intuitively at first. You need | to go read the thesis and realise they're referring to data | structures implemented as algebraic data types, which in the | context of a purely functional language can themselves be thought | of as functions, and can therefore be described as 'functional' | ... | | But unless you do that, the first thought is going to be "huh? | can arrays be further split into imperative vs functional?" "Does | he mean immutable?" "Can I use functional arrays in C?" "Are they | faster/safer than normal arrays?" | | By contrast, I think "Purely Functional Data-Structure Types" | would have been a far more intuitive term ... but I suppose the | author may have felt that clarifying further wasn't punchy | enough, and could have made the thesis more wordy... | Joker_vD wrote: | A data structure _is_ a type, so "data-structure type" is a | pleonasm. Please don't misuse terminology :) | aeonik wrote: | I'm reading this book right now. It's really great so far!
| | I've been working a lot with trees in Clojure, and have been | hitting serious limitations of my understanding. | | I also found this YouTube video from a Clojure conference that | reviews some different strategies for tree traversal in Clojure: | https://youtu.be/YgvJqWiyMRY | | I thought that learning a functional Lisp would make it really | easy to traverse trees, since that is what the language is doing | to actually execute its code. | | Turns out all the stuff I want to do is simply hard. | | Functional data structures are really awesome though; it just | seems to take a bit of up-front investment. | jmkr wrote: | This has been a topic I've wanted to get into for a few years | now, specifically because of Clojure! So if you have any | additional recommendations I'd appreciate it. | | I really enjoyed Friedman's book `Scheme and the Art of | Programming` because it filled in some pieces missing from "The | Little Schemer" (and "Seasoned Schemer"). Building stuff like | `kons`, `map`, and all that `letrec` stuff. | | But the big difference between Scheme and Clojure is that in | Scheme, while it's "functional by concept," you get to escape | with `(set! ...)` whenever you want to have a more traditional | ds/algo (like say doing the n queens problem). | | In Clojure you can kind of do that by either escaping to Java, | using protocols (with atoms), or using transients, but I often | feel like there's a "more functional way to do stuff that | hasn't been taught to me." | | I've opened up either the Okasaki dissertation or book or both, | but I've always had trouble reading it, and then sticking with | it. And some stuff like looking at Rosetta Code and saying "to | reverse a linked list in a Lisp is easy... because it's a cons | cell" seems like cheating. Almost like showing up to an | interview, saying your "linked list" is implemented in an array | structure and then calling `reverse()` on it.
| | Will watch that talk from 2014, must not have seen it before. | | I guess, conceptually, day-to-day things in Clojure do feel | pretty natural, even easier, and I think I have a decent | understanding of it. But then when I look at leetcode-type | problems, or something more involved, it takes a lot of mental | effort to translate to it. Especially things like `big O` get | thrown away in my mental model. I get it, persistent data | structures and all that, but there's still a mystery there. | smabie wrote: | I feel like they are.. not so awesome: they are grossly | inefficient due to all the pointer chasing and are pretty much | guaranteed to be slower than the alternative since they trash | your cache. | aeonik wrote: | Most of the problems I'm trying to solve require those | pointers anyway. | | Tracking diffs and versions of data over time: version | control systems and RDBMSs just don't cut it. | | Bitemporal databases are interesting to me as well. | agentultra wrote: | You _feel_ this way, but if you do the math and benchmarks you | will be surprised. | | You might not be running GHC's RTS on a resource-constrained | target platform, but you can generate the C code that could | run on that platform from Haskell, leveraging the guarantees | you get with using Haskell's advanced type system. | | _Update_: the point is that GHC can optimize code, and some of | those optimizations can avoid pointer chasing! But like most | optimizing compilers, sometimes GHC can miss them, and you | have to do a bit of work to tell GHC where it can avoid | chasing unnecessary pointers (or where it should/should-not | inline, give it some help to know where it can do fusion, | etc.). | jfoutz wrote: | I wish I could remember the exact book; I think it's _Writing | Solid Code_. There was a long example about the Excel | evaluation engine back in the day when it was shipped on CDs | and had to be perfect.
The approach was: use one big, dumb, | slow, but understandable and correct implementation; in | parallel, use the lightning-fast, super-whiz-bang new | implementation. Since both shared the same interface, they | could be swapped out, or both run at the same time. | | I think there is real value in starting with the pure | functional versions, then swapping out when needed. One | problem that seems fairly common in large codebases is using | an object as a hash key. But as the code grows, that object | sort of gets passed around and eventually somebody updates it | without rehashing. That's a tough bug to find. They are for me, | anyway. | | This is one of those rare cases where you can actually make | it faster later without trashing the overall design. | | I'd encourage starting with the pure functional version | first, every time. I'd go further and say, leave some | opportunity to flip back to the pure functional version in | dev builds for isolating heisenbugs. | | Blah blah, grain of salt, free advice, your mileage may vary, | games have different requirements than webpages, everything | is contextual. | | This is one rare case where having the capability of swapping | back and forth is always worth it. Just using it, and thinking | hard before giving it up, is a really good default. | 1MachineElf wrote: | I wanted to verify the book reference, and I think you | could be correct. | | There is a website for it: https://writingsolidcode.com/ | | The Amazon page it links to includes a few index pages in | the preview. I saw these under _E_: | | Excel number formatting code 163 | | Excel's dialog handling code 113 | wtetzner wrote: | They're only grossly inefficient if you wouldn't otherwise | need to frequently make copies of large mutable data | structures. It all depends on what you're trying to do. | duped wrote: | There are entire fields of research dedicated to studying | cache-oblivious persistent data structures, and they're | almost always trees.
| | One of the key points of immutable data structures is that | they're easy to reason about, given that the invariants are | upheld. Designing an efficient memory representation that | upholds those invariants is an entirely separate problem | domain. Not every pointer dereference is slow, and not every | tree is represented with pointers. | mtrimpe wrote: | I agree that it's a great book, but I wouldn't recommend it as | an entry point into understanding purely functional data | structures. | | Okasaki fleshes out a couple of examples and explains his | thought processes and heuristics for developing such data | structures quite well, but they're very much still notes on the | early-stage exploration of the concept. | | From a historical perspective it's still a fascinating read, | though, and I definitely recommend it if you want to read up on | the origins of immutable data structures. | paddw wrote: | It would be cool if someone could translate this into TypeScript | or the like; I think it would make it a lot more readable. | haskellandchill wrote: | TypeScript is much less readable than Haskell and OCaml, but | you can easily find translations to TypeScript such as | https://github.com/skeate/lambdata. | nequo wrote: | > TypeScript is much less readable than Haskell and OCaml | | That's like saying that Norwegian is much less readable than | Italian. It is in the eye of the beholder. They can both | express the same concepts, but which one is more readable | depends on which one you already know. | substation13 wrote: | Using this to push Brainfuck at work | consilient wrote: | > They can both express the same concepts | | Only in the Turing-tarpit sense. Out of the box, they have | very different capabilities. For example: | | Higher-kinded types: easy in Haskell, hard in OCaml, mostly | impossible in Typescript.
| | First-class modules: OCaml has them, Typescript can sort of | simulate them with extraordinarily unsafe prototype | mangling stuff that you should never ever use, impossible | in Haskell | | Open variants: Easy in OCaml and Typescript, hard in | Haskell | gexahaha wrote: | There's also a nice addendum on cstheory.stackexchange, "What's | new in purely functional data structures since Okasaki?" - | https://cstheory.stackexchange.com/questions/1539/whats-new-... | dang wrote: | A few discussions: | | _What 's new in purely functional data structures since | Okasaki? (2010)_ - | https://news.ycombinator.com/item?id=11056704 - Feb 2016 (42 | comments) | | _What 's new in purely functional data structures since | Okasaki? (2010)_ - https://news.ycombinator.com/item?id=7081191 | - Jan 2014 (17 comments) | | _What 's new in purely functional data structures since | Okasaki?_ - https://news.ycombinator.com/item?id=1983461 - Dec | 2010 (2 comments) | | _What 's new in purely functional data structures since | Okasaki_ - https://news.ycombinator.com/item?id=1713594 - Sept | 2010 (1 comment) | anfelor wrote: | I would also recommend Koen Claessen's simplified finger trees | (https://dl.acm.org/doi/abs/10.1145/3406088.3409026) and Zip | trees (https://arxiv.org/pdf/1806.06726.pdf) for purely | functional skip lists. | mefarza123 wrote: | Since Okasaki's work there have been several advancements and | new developments in the field:(Source: MirrorThink.ai) | | 1. PaC-trees: Supporting Parallel and Compressed Purely- | Functional Collections - 2022: This paper introduces PaC-trees, | a purely functional data structure that supports parallel and | compressed collections. PaC-trees are designed to be efficient | in terms of space and time complexity while maintaining the | benefits of functional data structures. | | 2. 
Proving tree algorithms for succinct data structures - 2019: | This paper discusses the development of tree algorithms for | succinct data structures, which are compact representations of | data that support efficient query and update operations. | | There is some work in other fields too: | | 1. CyBy2: a strongly typed, purely functional framework for | chemical data management - 2019-12-30: This paper presents | CyBy2, a purely functional framework for managing chemical | data. The framework is designed to be strongly typed and | enforce referential transparency through the type system. | | 2. chemf: A purely functional chemistry toolkit - 2012-12-20: | This paper introduces chemf, a purely functional chemistry | toolkit that provides a set of algorithms and data structures | for working with chemical data in a functional programming | context. | bafe wrote: | [dead] | smasher164 wrote: | RRB trees in particular allow you to benefit from some of the | cache efficiency of regular vectors, while supporting | operations like immutable update. They are used in languages | like Clojure and Scala. | solomatov wrote: | There's a book which looks like it's based on this dissertation, | which probably is a better source to read if you are interested | in this topic: | https://www.amazon.com/Purely-Functional-Data-Structures-Oka... | chalcolithic wrote: | my dream language is rebol with immutable data structures only | 1MachineElf wrote: | What are software bugs that can be avoided by choosing data | structures like these? | | I'm making a broad, high-level presentation about immutability in | technology. At my company we have folks who have heard of it in | the context of ransomware-resilient backups, others who have | heard of it in the context of infrastructure as code, and very | few who have heard of it in terms of data structures (distributed | and non-distributed).
My goal is to showcase the concept in | various contexts so that people can better understand its role as | a key design choice in technology. | | Personally I have no experience working on software that utilizes | these, so if others here do, I would appreciate your input on how | these make your software more reliable. | | The emphasis on software reliability and bugs-avoided is because | the audience works under the company's risk-management division. | chrisfinazzo wrote: | Disclaimer: Not a programmer - I do not have a CS degree. One | of my minors as an undergrad was MIS and while I took a | database programming course as a senior, my knowledge is | limited. | | The high-level explanation I recall reading years ago somewhere | (Joel on Software, I think??) was that a) Functional programs | have no side effects, leading to the notion that b) They can, | without much effort, be adapted for parallel tasks. | | The example here was MapReduce, which was the original guts of | the Google Search algorithm. | | E.g, Things are fast because we can send your query to many, | many computers to find an answer, and chances are good that | that machine is (relatively) close to you, which means low wait | times to get your answer. | thethimble wrote: | Purely functional data structures are very common in purely | functional languages like Haskell but are also used in non | functional languages via libraries like immutable.js. | | At a high level, immutability forces you to be extremely | deliberate about state changes in your application. This | improves reasoning/understanding, reduces bugs, and eases | debugging. | | An example of immutability that you might be familiar with | would be react props/state. You don't modify your state. This | makes reasoning about state much more simple. | otto-edgar wrote: | I think the point is that these data structures can be the | foundation of a pure-functional style of programming. 
| | You wouldn't drop these into an ecosystem where the programmers | are used to dealing with mutable structures. | | Now, whether using a purely functional language correlates with | writing fewer bugs is a separate discussion; not sure what the | modern consensus is. | mrkeen wrote: | > not sure what the modern consensus is. | | Too good a phrase to pass up! | | Modern consensus is done by constructing a replicated log - | an append-only data structure shared across multiple workers. | It's what you use Paxos for, and it's what Raft streamlines. | red_admiral wrote: | _Immutable_ data structures (not 100% the same thing, but a lot | of overlap) avoid all kinds of concurrency problems, because | you can safely pass them around and you'll never get a data | race. You don't even need any locking (just to make things | complicated, _lock-free_ data structures are another closely | related but not identical concept). | taeric wrote: | Careful with this wording. They avoid shared-memory | mutations. They don't necessarily eliminate data races. Rather, | they just change them, since, by definition, every edit is now | creating stale data. | | At large, in distributed software, this is a distraction, | since most passing of messages from one distributed | piece to the other was already doing a copy across mediums, | such that sent data was already immutable from the sender's | perspective. (Granted, the preparation step can have you step | on your own feet.) | Chris_Newton wrote: | Purely functional data structures are a means, not an end in | themselves. | | Programming with immutable values and the more declarative | style they support does design out at least one infamous source | of bugs: shared mutable state.
That has become increasingly | important in recent years, for example because modern chips | support ever increasing numbers of concurrent threads in | hardware and because a lot of the software we write is now part | of distributed systems. | | That programming style has other advantages in terms of | readability and reasoning about code. If you can be confident | that a particular name in the code always refers to the same | value once it's been defined, tracing the flow of data through | a function or through an entire system often becomes much | easier. Having more understandable code naturally tends to | improve reliability, among other advantages. | | If we adopt that programming style then we still need to | represent collections of values and different relationships | within our data, but we can't always use "traditional" data | structures to do that any more because many of them rely on | mutability, which we have excluded. So we need other ways to | represent structured data that _are_ compatible with | immutability and declarative style, and that is what purely | functional data structures give us. | okennedy wrote: | As the peer comment points out, functional data structures | force you to be deliberate about where/when state changes take | place. In particular, they force you to be cognizant of which | _version_ of a data structure a particular piece of code is | referencing. | | Immutable structures can help significantly in reducing race | conditions in parallel code, as it is impossible for one thread | to modify part of a data structure while another thread is | accessing it. Each thread sees a consistent 'version' of the | data structure until the code explicitly replaces it with a new | version. | | Immutability can also help with memory optimizations: A node of | one data structure can be safely re-used in another data | structure. 
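The node re-use described here can be sketched in a few lines of Python (illustrative only; a persistent singly linked list is the simplest case): two versions of the list share their common tail instead of copying it.

```python
from dataclasses import dataclass
from typing import Any, Optional

# An immutable cons cell: once built, head and tail never change.
@dataclass(frozen=True)
class Node:
    head: Any
    tail: Optional["Node"]

def cons(x, xs):
    # O(1): building a new version never copies or mutates the old one.
    return Node(x, xs)

shared = cons(2, cons(3, None))  # common tail: [2, 3]
v1 = cons(1, shared)             # one version:     [1, 2, 3]
v2 = cons(9, shared)             # another version: [9, 2, 3]

assert v1.tail is v2.tail        # both versions reuse the very same nodes
```

Because no one can mutate `shared`, handing it to both versions (or to another thread) is safe by construction.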
For example, if you need to retain older versions of | a tree, you can share common nodes (this is how Git encodes | version changes). | alex_lav wrote: | I would love to be educated. I've seen claims about the merit and | value of functional programming throughout my (nowadays | relatively long) programming career. In practice I've never once | seen those values or merits come to fruition - just the same | cycle all software goes through. | | My very direct experience recently has been that Scala + cats resulted | in the same buggy, nonperformant software it was meant to prevent. | I understand that bad programmers produce bad programs, | regardless of language, but I feel pretty strongly that good | tools prevent some amount of the typical "bad" that makes bad | programs bad (ignoring the obviously maliciously bad examples). | So I don't really understand, and would like to understand, if, | how and when pure FP (and I suppose FP in general) actually | improves quality of code/life outside of toy examples. | mrkeen wrote: | The bottom line is that pure FP means that the same input to a | function gives you the same output. | | When you debug, you just give the program the same input which | was problematic and you get to reproduce the error. | | Persistent data structures make it less wildly inefficient to | do so. | user2342 wrote: | Thanks. I have the printed book, but a PDF is much more | comfortable! | EddieEngineers wrote: | _Why_ would we want to use purely functional data structures? | When do the pros of functional data structures outweigh the | additional complexity? Are there scenarios when a project would | want to pivot from a regular data structure to a purely | functional one? | yawaramin wrote: | Do you use git? The git commit graph is a purely functional | data structure. | toolslive wrote: | they have some nice properties that you might want to benefit | from.
For example, they're persistent: every previous state of | the data structure is still in there... aka snapshots! | haroldl wrote: | The most common point is that they're safe to share between | threads, making parallel algorithms easier to invent, | understand, and implement correctly. | | You can also safely re-use sub-structures without performing a | deep copy. For example, if you want to keep a sub-tree around | for later, you can do that in O(1) time because it's safe to | keep a reference to it. If it is a mutable tree, you don't know | what's going to happen to it, so you need to do a deep copy of | the entire sub-tree you're holding on to. This can save a lot | on memory allocation and copying, depending on your use case. | wtetzner wrote: | There's a tradeoff between the complexity of the implementation | of a data structure and the use of one. While the complexity | of implementing purely functional data structures is often | (except maybe in the case of singly linked lists) higher than | that of their mutable counterparts, actually using them in a | program is simpler and less error-prone. | | There are obviously other tradeoffs as well, like performance | and memory usage. | nimih wrote: | As is explained in the abstract, purely functional data | structures are most useful when you want to avoid assignment, | either due to the restrictions of your programming environment | (f.ex. you're programming in Haskell), or because that is a | property you're interested in for other reasons (e.g. you want | to ensure immutability due to concurrency concerns, or exploit | persistence to improve performance). | rg111 wrote: | Deep Learning applications are one area. | | Traditionally, OO code is written all the time. | | But after I learned JAX/Flax, a light turned on inside my head | and I now write functional Deep Learning code as much as I can. | All my side projects and new code are purely functional in | JAX/Flax.
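The JAX style being described can be sketched in plain Python (no JAX required; the function name and learning rate are illustrative): a training step is a pure function that takes parameters and gradients in and returns fresh parameters, mutating nothing.

```python
def sgd_step(params, grads, lr=0.1):
    """Pure update: returns new parameters; the inputs are left alone."""
    return tuple(p - lr * g for p, g in zip(params, grads))

params0 = (1.0, 2.0)
params1 = sgd_step(params0, grads=(10.0, 10.0))

assert params0 == (1.0, 2.0)   # the old version is still intact
assert params1 == (0.0, 1.0)   # the new version holds the update
```

Because the step is pure, the same inputs always give the same outputs, which is what makes JAX-style transformations (jit, grad, vmap) of such functions possible.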
PyTorch has a functional API known as functorch, and I | have used it in one project. | | Where lots and lots of data in 3-, 4-, or 5-dimensional tensors | exist, and you need to run lots of transformations on them, and | then you need to multiply a huge number of them a thousand times | each second - functional code makes much more sense and gives | immense sanity and peace of mind. | | Those of you writing Deep Learning code, learn functional | programming principles (immutable data, pure functions, leaving | no side effects, etc.), and apply them to DL via functorch or | JAX. | | Your life will never be the same. | debanjan16 wrote: | Do JAX and functorch have the same level of built-in | functionality (operations) as the original PyTorch library? | | Where can I learn about it other than the documentation? | rg111 wrote: | JAX is very barebones and will require you to write much | more code for the same task than you write in PyTorch. | | Functorch is still new, and honestly, there is little to | learn if you already know JAX. There are some talks from | Meta, and then there are always the docs. | [deleted] ___________________________________________________________________ (page generated 2023-05-30 23:00 UTC)