[HN Gopher] I can't believe that I can prove that it can sort
___________________________________________________________________
I can't believe that I can prove that it can sort
Author : Raphael_Amiard
Score : 229 points
Date : 2022-07-04 10:24 UTC (12 hours ago)
(HTM) web link (blog.adacore.com)
(TXT) w3m dump (blog.adacore.com)
| donatj wrote:
| Is this not a standard sorting algorithm? If I'm not mistaken,
| this is my coworker's go-to sort when writing custom sorts.
| Jtsummers wrote:
| It manages to be more inefficient than most, and will even
| disorder and then reorder an already sorted input. Bubble sort
| (a very low bar to beat) doesn't even do that. If you've only
| got a small number of items, it doesn't matter, unless you're
| sorting a small number of items a large number of times.
| dgan wrote:
| So, _Lionel_ knew how to implement a 3D renderer, but had no clue
| about O(n) complexity or other sorting algorithms? I am baffled
| im3w1l wrote:
| That's self-taught for you. Will know some advanced stuff, and
| miss some basics depending on what they found interesting or
| useful.
| Agingcoder wrote:
| The exact same thing happened to me, and this is how I
| discovered algorithmic complexity. Sorting my triangles took
| forever (I saw it in the profile, taking a whopping 90% of the
| time) and I eventually figured there might be a proper sorting
| algorithm out there.
|
| I was at the same time happily churning out assembly code,
| talking to the VGA card, writing a DOS extender, etc.
|
| You can actually do quite a few things without formal
| education!
| touisteur wrote:
| Well, hi there, it me. Back then I was 17 and wading through
| software rendering, so (light) mesh tessellation, triangle
| projection (twice, for a stereoscopic anaglyph renderer),
| triangle filling, texture mapping, with at most 300 objects,
| all in crappy crashy C (first with VESA, then SDL eased my
| life...) with lots of misunderstandings about pointers.
|
| I was welllll over my head with complex stuff, and sorting
| didn't appear in the profiles, so... I guess you can call that
| profile-guided learning? I had the formal training later on,
| and even then it didn't stick until I faced the actual
| complexity problem head-on.
|
| I'll never forget that whole weekend with my AMD Athlon 900 at
| 100% CPU sorting a 200,000-word dictionary... It was still not
| finished on Monday. I implemented (a very primitive) insertion
| sort and it was done in less than 2 minutes...
|
| That was my 10,000-a-day day: https://xkcd.com/1053
| fn_t_b_j wrote:
| OJFord wrote:
| Why is this algorithm at all surprising?
|
| (I'm really not trying to brag - I assume I must be missing
| something, that I'm naive to think it obviously works, and it
| actually works for a different reason.)
|
| In plain English - 'for every element, compare to every element,
| and swap position if necessary' - ... of course that works? It's
| as brute force as you can get?
| Asooka wrote:
| The comparison is not the way you'd expect. Index j goes past
| index i, but the comparison doesn't change. What you propose is
| a condition more like
|
|     if ((a[i] < a[j]) != (i < j)) swap(a, i, j);
|
| which obviously works.
|
| The surprising algorithm sorts even though it swaps elements
| that are already ordered.
| OJFord wrote:
| Ok, I think I see why it's a bit weird: '1,2,3 when i=1 and
| j=3 it swaps them anyway' sort of thing?
|
| But the i-loop comes through 'afterwards', so when i=3 (value
| now 1) and j=1 (3) it sets them straight.
|
| It still seems quite intuitive, but I think I cheated by
| skimming over it initially and thinking it was clear, whereas
| actually I've thought about it more now.
|
| (Not to compare myself to anything like him, I'm no
| mathematician at all, but I'm reminded of an amusing Erdos
| anecdote in which he apparently paused mid-sentence in some
| lecture, left the room, and came back some time later
| continuing 'is _clearly_ [...]
' or similar!)
| YetAnotherNick wrote:
| If the inner loop went from j = i to n, it would be easy to see
| that it sorts in decreasing order. But the loop goes from j = 1
| to n, and then suddenly it sorts in increasing order.
| shp0ngle wrote:
| Look again at the comparison direction.
|
| It is opposite!
|
| It works, but not as you think it does.
| smusamashah wrote:
| Here is how it looks:
| https://xosh.org/VisualizingSorts/sorting.html#IYZwngdgxgBAZ...
|
| If you compare it with both insertion and bubble sort, you can
| see it looks more like insertion sort than bubble sort.
| bjterry wrote:
| The video from the article
| (https://www.youtube.com/watch?app=desktop&v=bydMm4cJDeU) is
| much better because it highlights the index of the outer loop,
| which is unclear from the cascading visualization there. By
| seeing the indexes it becomes clear that (1) in the area before
| the outer index, every value gets swapped in and out of the
| outer loop location to put them in order, and (2) at the end of
| the first outer loop iteration, the largest element will be at
| the location of the outer loop index, and so everything to the
| right is unchanged in each iteration.
| smusamashah wrote:
| This is nice. I had a way to highlight indexes at one point
| in this tool. Got rid of it when I was trying to add a way to
| visualize any custom algo. I should add it back. It adds more
| value/info to the visualization.
| calmingsolitude wrote:
| And here's a version I wrote that visualizes it on your
| terminal: https://github.com/radiantly/simplest-sort
| iib wrote:
| I guess it can be thought of as an unoptimized insertion or
| bubble sort.
|
| I think it is very possible to write this algorithm by mistake
| in intro compsci classes when you try to code a bubble sort by
| heart. I would think TAs have seen many such instances in their
| students' homework.
| naniwaduni wrote:
| There's a surprisingly large class of "sorts people
| accidentally write while intending to write a bubble sort".
|
| This one is kind of special, though, since it's somehow
| _more_ offensive to intuition than bubble sort itself.
| mmcgaha wrote:
| I am guilty. I wrote this sort for a GNU screen session menu
| years ago and even named my function bubsort.
| raldi wrote:
| Is the visualization supposed to animate? I can't figure out
| how to make it start.
| Jtsummers wrote:
| There is a "start" button at the end of the list of sorts.
| Avamander wrote:
| If you click it multiple times it's wonderfully chaotic.
| FrozenVoid wrote:
| Combsort is a far more elegant and faster algorithm. I wrote a
| type-generic combsort a while ago here:
| https://github.com/FrozenVoid/combsort.h
|
| (Combsort, like the algorithm mentioned here, also consists of
| two loops and a swap.)
| JadeNB wrote:
| I don't think that the goal here is to show a fast and elegant
| sort, but rather to show that a sorting algorithm that seems
| like it can't possibly work actually does. That is, probably
| no-one will learn from this article how to sort better, but
| hopefully people will learn from this article how to formally
| prove things (e.g., about sorting) better.
| touisteur wrote:
| Yes (co-author here), that was exactly the point. Thanks for
| putting it clearly.
| fweimer wrote:
| Isn't the proof incomplete because it does not ensure that the
| result is a permutation of the original array contents? Just
| overwriting the entire array with its first element should still
| satisfy the post-condition as specified, but is obviously not a
| valid sorting implementation.
| touisteur wrote:
| It was touched upon in the post: 'Tip: Don't prove what you
| don't need to prove' (and surrounding text). With a link to
| another proof for another sorting algorithm.
| yannickmoy wrote:
| and there are actually multiple ways to prove that the result
| is a permutation of the entry, either by computing a model
| multiset (a.k.a.
bag) of the value on entry and exit and
| showing they are equal, or by exhibiting the permutation to
| apply to the entry value to get the exit value (typically by
| building this permutation, in ghost code, as the algorithm
| swaps values), etc.
|
| but none of this makes a good read for people not already
| familiar with program proof tools.
| carlsborg wrote:
| I vaguely remember seeing this somewhere in the first few
| chapters of Spirit of C by Mullish and Cooper.
| ermir wrote:
| Isn't the sorting algorithm in question the famous BubbleSort? I
| understand the value of formally proving it works, but why is the
| name mentioned nowhere?
| [deleted]
| justusthane wrote:
| Nope - here's the original paper on the algorithm in question:
| https://arxiv.org/pdf/2110.01111.pdf
| palotasb wrote:
| No, it's not. Previously discussed on HN here:
| https://news.ycombinator.com/item?id=28758106
| (https://arxiv.org/abs/2110.01111)
| smcl wrote:
| It feels similar at first blush, but it's not really. In bubble
| sort you compare/swap adjacent elements, and exit if you make a
| pass through the collection without making any changes. Whereas
| this will compare/swap the element at every index with every
| other index, and just exits when it's done performing all those
| comparisons.
| ufo wrote:
| It's actually closer to insertion sort.
| hobo_mark wrote:
| Wow, I wish we had these built-in provers in VHDL (which is
| basically Ada in sheep's clothing).
| freemint wrote:
| They are there. You just have to pay a lot to the right
| companies.
| nabla9 wrote:
| >which is basically Ada in sheep's clothing
|
| Only the syntax is similar.
| Stampo00 wrote:
| I've been watching those mesmerizing YouTube videos visualizing
| sorting algorithms lately. The header of this article uses a
| screen cap from one of them.
|
| Them: So what shows do you watch?
|
| Me: ... It's complicated.
|
| There are a lot of different sorting algorithms. Like, _a lot_,
| a lot.
|
| As I watch them, I try to figure out what they were optimizing
| for. Some only scan in one direction. Some only use the swap
| operation. Some seem to do the minimum number of writes. Some are
| incremental performance improvements over others.
|
| When I see an algorithm like this, I don't assume the person who
| wrote it was an idiot. I assume they were optimizing for
| something that's not obvious to me. Its only modifying operation
| is swap, so maybe that operation is faster than an arbitrary
| insert for whatever system or data structure they're using. There
| are no temporary variables besides loop counters, so maybe
| they're in a memory-constrained environment. There's barely any
| code here, so maybe this is for a microcontroller with precious
| little ROM. Or maybe they're applying this as a binary patch and
| they have a strict ceiling on the number of ops they can fit in.
|
| Or maybe it's just the first sorting algorithm they could think
| of in an environment that doesn't ship with one, and the
| performance is adequate for their needs. In that case, it's
| optimized for developer time and productivity. And honestly, it's
| a far more elegant algorithm than my "naive" version would be.
|
| These are all valid reasons to use a "naive" sorting algorithm.
| ki_ wrote:
| A "new sorting algorithm"? Pretty sure it existed 50 years ago.
| fdupress wrote:
| Every time somebody proves that a sorting algorithm sorts, they
| forget to prove that the algorithm doesn't just drop values or
| fill the entire array with 0s (with fixed-length arrays, as
| here).
|
| On paper: "it's just swaps". Formally: "how do I even specify
| this?"
|
| (For every value e of the element type, the original array and
| the final array have the same number of occurrences of e. Show
| that that's transitive, and show it's preserved by swap.)
| touisteur wrote:
| Hi, co-author here.
We actually touched upon this subject in
| the article: 'Tip: Don't prove what you don't need to prove'
| and surrounding text. Also a previous comment by Yannick:
| https://news.ycombinator.com/item?id=31980126
| fdupress wrote:
| Looked for it and missed it.
|
| And thanks for writing this up, by the way. Having people
| less familiar with formal methods try and write up their
| experiences is, I think, much more effective at letting
| people in than having experts try and write tutorials.
| MrYellowP wrote:
| > published at the end of 2021
|
| I'm so confused. That "new" algorithm is just BubbleSort?
| Jtsummers wrote:
| As addressed in other comments in this thread and past threads
| on this sort (when the paper came out), it is _not_ bubble
| sort. Bubble sort only compares adjacent items and swaps them
| if they are in the wrong order. This has the effect of
| "bubbling" the largest value to the top (the way it's often
| written; it could also be reversed to push the smallest down
| first).
|
| This sort is like a worse-performing version of insertion
| sort. But insertion sort creates a partition (and in doing this
| performs a sort) of the first _i+1_ items, with _i_ increasing
| until the entire sequence is sorted. This one will scan the
| entire sequence every time on the inner loop instead of just a
| subset of it, and usually performs more swaps.
| prionassembly wrote:
| I taught myself this exact algorithm at 8 or 10 by sorting
| playing cards (we established an order for the suits: diamond <
| leaf < spades < heart) on a very very very long plane trip where
| I had neglected to bring comic books or something.
| ufo wrote:
| The comments so far have tended to focus on the proof itself, but
| for me the coolest part of the blog post was the formal methods.
|
| Does anyone here also use SPARK for this sort of thing? Are there
| other formal methods tools you'd use if you had to prove
| something like this?
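[Editor's note] For readers skimming the thread: the algorithm everyone is discussing (from the arXiv paper linked above) fits in a few lines. A minimal Python sketch (the function name is mine, not the paper's):

```python
def cant_believe_it_sorts(a):
    """The counterintuitive algorithm from arXiv:2110.01111.

    Note the comparison direction: it swaps whenever a[i] < a[j],
    even when i < j and the pair is therefore already in ascending
    order. Despite that, the final result is sorted ascending.
    """
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

print(cant_believe_it_sorts([3, 1, 4, 1, 5, 9, 2, 6]))
# -> [1, 1, 2, 3, 4, 5, 6, 9]
```

Unlike bubble sort, it never looks at whether a pair is adjacent or already ordered; it just runs both loops over the full range, which is exactly what makes its correctness surprising.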
| im3w1l wrote:
| Yeah, I see this as a very interesting reply to the TLA+ post.
| Might spend my evening diving into Ada and SPARK.
|
| Edit: Some things I noticed. The package gnat-12 does not have
| gnatprove. Ada mode for Emacs requires a compilation step that
| fails with GNAT Community Edition. With Alire there is no
| system gnat, so it cannot compile it (quite possibly I'm missing
| something). In the end I gave up on using Emacs. GNAT Studio
| wouldn't run for me until I realized it needed ncurses. It also
| had some unhandled exceptions for me (raised PROGRAM_ERROR :
| adjust/finalize raised GNATCOLL.VFS.VFS_INVALID_FILE_ERROR:
| gnatcoll-vfs.adb:340), but in the end I managed to get it up
| and running.
|
| Edit2: After playing around with it, I'm extremely impressed
| with what SPARK can do. I made a function to add 7 to numbers.
| Tried putting a post condition that the return value is bigger
| than the input. "Nuh uh, it could overflow". Ok, so I add a pre
| condition that numbers must be less than 100. "Uhm, buddy, you
| are passing in 200 right there". This is useful stuff for real,
| and easy to write too.
| yannickmoy wrote:
| you can ask Alire to install the latest GNAT it built, or to
| use another version installed on your machine, see
| https://alire.ada.dev/transition_from_gnat_community.html
|
| It also explains how to install GNATprove: ``alr with
| gnatprove``
| im3w1l wrote:
| Sorry if it was confusing - I kind of jumped between the
| issues I had with various approaches. I did manage to get
| gnatprove through Alire through just that command; it was
| the apt gnat that didn't have gnatprove. What I wasn't sure
| how to correctly do with the Alire install was
|
|     cd ~/.emacs.d/elpa/ada-mode-i.j.k
|     ./build.sh
|     ./install.sh
|
| Actually, I didn't get that working with any of the options
| I tried, I guess.
| layer8 wrote:
| I predict that one day it will be common for this to be a
| builtin language feature.
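[Editor's note] im3w1l's experiment above uses SPARK's Pre/Post contract aspects, which gnatprove discharges statically at analysis time. As a rough runtime analogue only (Python assertions instead of a static proof; the function name and the 32-bit bound are made up to mirror the anecdote):

```python
def add_seven(x: int) -> int:
    """Runtime sketch of the contract im3w1l describes. SPARK would
    prove all three checks below at analysis time, including the
    overflow obligation on a bounded integer type; plain assertions
    only fire when the code actually runs with a bad input."""
    INT_MAX = 2**31 - 1            # stand-in for a bounded integer type
    assert x < 100, "precondition: x must be < 100"
    result = x + 7
    assert result > x, "postcondition: result is bigger than the input"
    assert result <= INT_MAX, "the overflow obligation the prover raised"
    return result
```

The difference in kind: `add_seven(200)` fails here only when executed, whereas gnatprove rejects the equivalent SPARK call before the program ever runs.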
| [deleted]
| twawaaay wrote:
| I actually used this algorithm a decade ago to implement a log-
| based, transactional database system for an embedded system with
| a very low amount of memory and a requirement that all memory be
| statically allocated.
|
| To the frustration of the rest of the development team, who first
| called me an idiot (I was new) and then could not make quicksort
| run as fast on inputs that were capped at something like 500
| items. Apparently, algorithmic complexity isn't everything.
| delecti wrote:
| I feel like anyone who was surprised that algorithmic complexity
| isn't everything probably didn't totally understand it. The
| assumptions (like ignoring constants) are straight out of
| calculus limits. That (+10000) on the end doesn't mean anything
| if you're sorting an infinite list, but it means a lot if
| you're sorting 15 (or in your case 500) entries.
| twawaaay wrote:
| Well, it actually is kinda worse (or better, depending on how
| you look at it).
|
| It is not necessarily +10000, it can also be something like
| x5000.
|
| Because CPUs really, really, really like working on short,
| simple, predictable loops that traverse data in a simple
| pattern, and hate when that is interrupted with something like
| dereferencing a pointer or looking up a missing page.
|
| So your super complex and super intelligent algorithm might
| actually be only good on paper, while doing more harm to your
| TLB cache, prefetcher, branch predictor, instruction
| pipeline, memory density, etc.
|
| So there is this fun question:
|
| "You are generating k random 64 bit integers. Each time you
| generate the integer, you have to insert it in a sorted
| collection. You implement two versions of the algorithm, one
| with a singly-linked list and one with a flat array. In both
| cases you are not allowed to cheat by using any external
| storage (indexes, caches, fingers, etc.)
|
| The question: in your estimation, both algorithms being
| implemented optimally, what magnitude k needs to be for the
| linked list to start being more efficient than array list."
|
| The fun part of this question is that the answer is: NEVER.
| simiones wrote:
| > The fun part of this question is that the answer is: NEVER.
|
| Are you claiming that it is faster in practice to insert an
| element at the beginning of an array with 1G items than it
| is to insert it at the beginning of a linked list with 1G
| items?
|
| Or that, on average, the time spent traversing the linked
| list (assuming an average number of cache misses) exceeds
| the time taken to move all elements in the array one to the
| right?
|
| I am certain the first item will not be true, and the
| second may or may not be true depending on many, many other
| details (e.g., in a language with a compacting GC, the
| linked list elements will often be stored approximately
| like in the array case, and you may not pay too high a cost
| for cache misses).
| twawaaay wrote:
| > Are you claiming that it is faster in practice to insert an
| element at the beginning of an array with 1G items, than it
| is to insert it at the beginning of a linked list with 1G
| items?
|
| You missed that the insertion point is selected at random and
| must be found as part of the operation. But in practice, if
| you know data will be inserted preferentially at the wrong
| end, you can just reverse the data structure to make
| insertions prefer the better end.
|
| > I am certain the first item will not be true
|
| I am used to it. Everybody gets it wrong until they spend
| some time thinking about it, get it explained and/or run
| a real-life test.
|
| Hint: think about the cost of iterating over half of the
| linked list versus performing binary search over the
| array AND moving half of it.
|
| The cost of binary search becomes negligible pretty fast, and
| the cost of moving 8 bytes of memory is always going to be
| lower than the cost of iterating over one entry (16 bytes) of a
| linked list. And you have a statistically equal number of both,
| assuming you are selecting values at random with an equal
| chance for each 64-bit integer to be the next choice.
|
| The CPU can be moving a lot of bytes at the same time, but it
| has to dereference the pointer before it can perform the
| comparison before it can dereference the next pointer...
|
| The actual algorithmic complexity of both algorithms is
| exactly the same. But an array is much more efficient
| (instructions per item).
| kmonsen wrote:
| You are right and this is well known, but not only for the
| reason you state. Cache locality is so much better in an
| array/vector compared to a linked list, and that is really
| important.
| twawaaay wrote:
| Good point. Cache locality is another important reason that
| I should have mentioned. I have not run tests, but it may
| possibly be the most important reason of them all.
|
| In many linked data structures you can mitigate a lot of
| this problem. For example, in the past I have used
| contiguous arrays where I would allocate linked elements;
| then I would try to exploit how the elements are produced
| to get them placed one after another, if possible.
|
| When data is produced randomly you can try to create
| "buckets", where each bucket is backed by an array and
| attempts to keep consecutive elements relatively close to
| each other. You could then try to rewrite those buckets
| in a way that gets amortised. Or you can just straight up
| write it as if it was an array of sorted linked-list
| elements where on each delete or insert you have to move
| some of the elements, but you only need to reorganise one
| bucket and not the others.
|
| All a bit cumbersome, and at the end of it you are still
| wasting at least half of the cache on pointers.
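[Editor's note] For readers who want to experiment with the trade-off being argued here, a minimal Python sketch of the two contenders (names and structure are mine; the original story was embedded C, not Python). `bisect.insort` performs the binary search plus the shift of the tail, while the linked list has to be walked node by node:

```python
import bisect

class Node:
    """Singly-linked list node; no external indexes or caches."""
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_array(arr, x):
    # binary search, then shift the tail right by one slot
    bisect.insort(arr, x)
    return arr

def insert_linked(head, x):
    # walk from the head until the insertion point is found
    if head is None or x <= head.value:
        return Node(x, head)
    cur = head
    while cur.next is not None and cur.next.value < x:
        cur = cur.next
    cur.next = Node(x, cur.next)
    return head

def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

On average each insert touches about half the structure either way (nodes walked vs. slots shifted), which is the point being made above: the big-O is identical, and what differs is the per-element cost of a pointer dereference versus a bulk memory move.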
| jasonwatkinspdx wrote:
| A fun related thing I first learned about years ago from a
| Dr. Dobb's article: some strided access patterns are
| FAR worse than random access patterns. This is because
| certain strides hit the worst-case behavior of limited-
| associativity caches, dramatically reducing the effective
| cache size.
| kmonsen wrote:
| Here is a video by Bjarne himself explaining this:
| https://youtu.be/YQs6IC-vgmo
|
| Note there is a use case for linked lists:
| inserting/deleting one or a few items once in a while. But
| in general you can just use vector for anything that is
| not a leetcode problem.
| jwilk wrote:
| I think you misunderstood the question.
|
| If you insert an element at the beginning, there's no
| list iteration cost.
| twawaaay wrote:
| Please, read the question.
|
| It stipulates that elements are selected at random.
|
| You select k integers at random.
|
| Some of them might be inserted faster into the linked list,
| but when you average it over k integers you will still be
| slower, because inserting into the array will be faster, on
| average.
|
| Just think what happens if the integer is inserted at the
| END of the list (same probability...). You need to slowly
| iterate over the entire linked list, while you quickly land
| at the result with a binary search on an array and then
| pay almost no cost inserting it there.
|
| If you think about it, ON AVERAGE, you have to search
| through half of the list (if you are a linked list) or move
| half of the array (if you are an array).
| jwilk wrote:
| The parts of simiones's message you quoted are not about
| the average case. They are about the corner case where an
| item would be added at the beginning, meaning the linked
| list has the biggest advantage.
| twawaaay wrote:
| You don't get to change the rules of the game while the
| game is being played.
|
| The original problem, as stated by me, was:
|
| "You are generating k random 64 bit integers.
Each time
| you generate the integer, you have to insert it in a
| sorted collection.
|
| (...)
|
| The question: in your estimation, both algorithms being
| implemented optimally, what magnitude k needs to be for
| the linked list to start being more efficient than array
| list."
|
| The cost is not evaluated for any individual insertion,
| only for the total cost of inserting k integers.
|
| For some of the integers the cost of inserting into the
| linked list will be higher, and for some the cost of
| inserting into the array will be higher. It does not matter.
| We do not care. We just care about the cost of inserting k
| integers thus randomly generated a) into a sorted linked
| list and b) into a sorted array list.
| [deleted]
| layer8 wrote:
| Well, there is one case where a linked list may be faster:
| when, due to memory fragmentation, you are unable to allocate
| _k_ contiguous elements (hence efficiency would drop to
| zero) but you would still be able to allocate _k_ nodes (or
| _k_/_n_ nodes with _n_ elements each). It can therefore
| make sense to use a linked list of arrays with some
| sensible limit on the array size (e.g. in the gigabyte
| range). As long as a smaller number of elements is used,
| the linked list will only have one node and degenerate to
| the array solution.
| twawaaay wrote:
| That would only be the case on a very old operating system
| (think MS-DOS).
|
| On new operating systems the process address space is pieced
| together independently from physical space. So you can
| have a contiguous, multi-GB array even if you don't have a
| contiguous, multi-GB physical region.
|
| As you read/write your virtual address space, the CPU
| detects that there is no TLB mapping and interrupts into
| the operating system, which behind the scenes finds a page,
| returns the mapping, and continues as if nothing ever
| happened.
|
| This is why a process nowadays can allocate any amount of
| memory -- more than there is physical memory on the
| machine.
It only starts grinding to a halt when you
| actually try to use it all. (This is actually one of my
| interview questions -- can a process allocate more memory
| than is physically available on the machine? Why does it
| work?)
| texaslonghorn5 wrote:
| If I understand correctly, you would just run the (n+1)th outer
| loop to insert the new item each time?
| [deleted]
| dahart wrote:
| You sure it was this algorithm and not bubble sort?
|
| > algorithmic complexity isn't everything
|
| Yeah, very true. Or at least hopefully everyone knows that
| complexity analysis only applies to large n, and small inputs
| can change everything. In console video games it was very
| common to avoid dynamic allocation and to use bubble sort on
| small arrays. It was also extremely common to avoid a sort
| completely and just do a linear search on the (small) array
| while querying; that can end up being much faster than sorting
| and binary searching, especially when the total number of
| queries or the number of queries per frame is also low.
| twawaaay wrote:
| Oh yeah, I did linear search a lot, to the same consternation
| of other devs.
|
| The issue was the device was so low on memory (2MB of unified
| NV and flash for code, files and operating memory) that there
| simply did not exist enough space to hold enough data to be
| any problem for the 20MHz ARM7 controller, as long as you did
| not do anything stupid. 600kB of it was used by OpenSSL and a
| further 300kB by the operating system anyway (I am aware of
| how funny it sounds that OpenSSL takes twice as much space as
| the OS).
| eranation wrote:
| That is pretty awesome if you came up with this exact
| algorithm a decade ago (it was published in 2021, but
| apparently was in the wild in 2015 [1], and I assume it was
| discovered earlier, but no one thought it was interesting
| enough to publish). Why didn't you use bubble sort /
| insertion sort?
(The algorithm in the paper looks like
| bubble sort at first look but is basically a clever
| insertion sort.) What was the benefit of using it this way?
|
| [1] https://cs.stackexchange.com/questions/45348/why-does-
| this-s...
| twawaaay wrote:
| I am not sure how awesome it is. It looks like something
| you can stumble on by accident on a whiteboard in an
| interview, and use successfully even if you don't know
| exactly why it works.
|
| Honestly, I always thought it was common knowledge and
| just too simple, and for this reason gets omitted from
| books.
|
| People in CS/IT tend to not spend a lot of time on
| algorithms with bad complexity, and so I am used to people
| discounting algorithms with useful properties just
| because there is another that is a bit more efficient
| when n goes to infinity.
| yannickmoy wrote:
| For a related eye-opening result, with both practical
| application and interesting research backing it, see this
| talk on "Quicksorts of the 21st century" at the Newton
| Institute workshop on verified software last year:
| https://www.newton.ac.uk/seminar/30504/
|
| TL;DR: practical complexity computation needs to take into
| account things like memory access latency on different
| architectures.
| Izkata wrote:
| Yeah, this sorting algorithm is something you can come up
| with by accident if you do bubble sort wrong. I wouldn't
| be surprised if a lot of beginners came up with this one
| in intro courses. I think I saw it around 2006-2008 when
| I was a TA and one of my students couldn't figure out why
| his array got sorted in the wrong order (don't remember
| for sure though).
|
| For example, from 2016, here's someone not getting the
| difference between the two:
| https://stackoverflow.com/questions/40786409/whats-the-
| diffe...
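[Editor's note] The mix-up Izkata describes is easy to reproduce side by side. A Python sketch (function names are mine): bubble sort compares adjacent pairs and can stop early, while the accidental variant compares a fixed a[i] against every a[j]. Both sort, but the swap counts differ:

```python
def bubble_sort_swaps(a):
    """Classic bubble sort on a copy; returns (sorted_list, swap_count).
    Compares ADJACENT pairs and stops once a full pass makes no swaps."""
    a = list(a)
    n = len(a)
    swaps = 0
    swapped = True
    while swapped:
        swapped = False
        for j in range(n - 1):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
                swapped = True
    return a, swaps

def accidental_sort_swaps(a):
    """The "bubble sort done wrong" variant on a copy; returns
    (sorted_list, swap_count). Compares a fixed a[i] against EVERY
    a[j], swapping even across already-ordered pairs."""
    a = list(a)
    n = len(a)
    swaps = 0
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
                swaps += 1
    return a, swaps
```

On [2, 3, 1], for example, bubble sort needs 2 swaps while the accidental variant performs 4, including swaps of pairs that were already in order: this is the "usually performs more swaps" behavior noted earlier in the thread.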
| jwilk wrote:
| The paper discussed on HN:
|
| https://news.ycombinator.com/item?id=28758106 (318 comments)
| drdrek wrote:
| He starts from the wrong axiom that it's hard to prove, and
| creates a lot of nonsense over that.
|
| It requires just two induction proofs:
| - One, that for I=1, after N comparisons the largest number is
| at position 1 (proven with induction); it's the base case.
| - The other, that for any I=n+1, if we assume that the first n
| slots are ordered, we can treat n+1 as a new array of length
| N-n and solve using the base case proof.
|
| Talking about computer science and not doing the bare minimum of
| computer science is peak software engineering :)
| [deleted]
| [deleted]
| tonmoy wrote:
| Note that the comparison sign is less-than, but the algorithm
| sorts in a descending order.
| umanwizard wrote:
| Not true.
|
| Clojure code:
|
|     (defn swap [items i j]
|       (assoc items i (items j)
|                    j (items i)))
|
|     (defn weirdsort [items]
|       (let [ct (count items)
|             indices (for [i (range ct)
|                           j (range ct)]
|                       [i j])]
|         (reduce (fn [items [i j]]
|                   (let [ith (nth items i)
|                         jth (nth items j)]
|                     (if (< ith jth)
|                       (swap items i j)
|                       items)))
|                 items
|                 indices)))
|
| Then in the REPL:
|
|     my-namespace> (def items (shuffle (range 20)))
|     #'my-namespace/items
|     my-namespace> items
|     [19 0 13 16 14 18 15 7 10 17 9 11 6 1 3 4 12 2 5 8]
|     my-namespace> (weirdsort items)
|     [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
|
| puffoflogic wrote:
| That is an elegant proof... for a totally different algorithm.
| J starts at 1, not at I. Every element of the loop is subject
| to be moved in every single outer iteration. Besides, it gets
| the comparison backwards.
| drdrek wrote:
| You are absolutely correct, I was gung-ho and got punished
| with internet shame; below is the correct proof. But the
| point still stands.
|
| Solution is even simpler:
|
| Empty case: after 1 iteration (I=1) the largest number is at
| position 1.
|
| Base case: after 2 iterations (I=2) the first 2 elements are
| ordered, and the largest number is at position 2.
|
| Assume N case: after N iterations the first N numbers are
| ordered (within the sub-list, not for the entire array) and
| the largest number is at position N.
|
| N+1 case (I=N+1): For every J<N+1 with A[J] >= A[I], nothing
| will happen. From the first J where J<N+1 and A[J] < A[I]
| (denote it x), each cell will switch, as A[J] < A[J+1]. At
| J=N+1 the largest number will move from A[N] to A[I]. For
| every J>N+1 nothing will happen, as the largest number is at
| A[I].
|
| Not part of the proof, but to make it clear, we get:
| - for J<x, A[J] < A[I]
| - for J=x, A[J] = A[I]
| - for J>x, A[J] >= A[I], and the list is ordered for the
| first J elements
| alex_smart wrote:
| > Solution is even simpler:
|
| Press X to doubt.
| umanwizard wrote:
| No, it works, there's just a bit of an unstated step in the
| proof.
|
| After j ranges from 1 to i, it will still be the case that
| the values from 1 to i are sorted. So you can assume that j
| starts at i without disturbing the induction condition.
|
| The reason this is true is... if A[i] is >= any element A[j]
| for j < i, then it is clearly true. Otherwise, there is some
| smallest j in 1 to i-1 inclusive such that A[i] < A[j]. Then
| A[i] will swap with that A[j], then it will swap again with
| the next A[j+1] (because the original A[j] was < A[j+1], by
| the induction case), then with the next element, ... swapping
| with every element until j reaches i.
|
| After this is done, we have maintained the condition that A[1]
| ... A[i] are sorted after j ranges from 1 to i.
| leereeves wrote:
| That is the correct inductive step all by itself.
 |
 | That's all we need to know: during each iteration after the
 | first, the algorithm inserts the ith element into the
 | previously sorted list from A[1] to A[i-1], giving a new
 | sorted list from A[1] to A[i], and doesn't touch the rest
 | because A[i] contains the maximum element.
 |
 | Then when i=n the whole list is sorted.
 | phkahler wrote:
 | I thought his goal was to get the prover to prove it without
 | understanding it himself.
 |
 | By realizing the low-indexed portion is always sorted, you've
 | already proved the algorithm yourself, and the prover is just
 | checking for bugs in your logic.
 |
 | I'm not saying the proof isn't valuable, just that it's not
 | magical and actually requires the user to understand the
 | majority of the proof already.
 | leereeves wrote:
 | This algorithm is surprising and interesting because it
 | doesn't work at all like it at first seems to.
 |
 | The low-indexed portion is sorted, but isn't guaranteed to
 | contain the lowest or the highest _i_ elements of the list
 | (except when i=1), and the list is ultimately sorted in
 | decreasing, not increasing order. The final sort doesn't
 | occur until the last iteration of the outer loop, when the
 | inequality is reversed (the interesting variable, j, is on
 | the right).
 |
 | Because of that, the proof outline discussed here doesn't
 | work.
 |
 | Consider what happens if the unique smallest element starts
 | at position n. It is placed at the start of the list (the
 | correct final position) in the final iteration of the outer
 | loop (i=n), and not before.
 |
 | Proof (for simplicity the list is indexed 1 to n):
 |
 | Let A[n] < A[k] for all k != n in the initial list.
 |
 | Elements are only swapped when A[i] < A[j]. If i != n and
 | j = n, then A[i] > A[n] = A[j], so A[n] is not swapped.
 |
 | Then, when i = n and j = 1, A[i] = A[n] < A[1] = A[j], so
 | A[1] and A[n] are swapped, placing the smallest element in
 | position 1 at last.
 | umanwizard wrote:
 | It's not true that it's sorted in decreasing order.
 See my
 | comment here: https://news.ycombinator.com/item?id=31978942
 | leereeves wrote:
 | Quite right, that was a silly mistake on my part. I did say
 | the smallest element was placed first, but wrote "the list
 | is ultimately sorted in decreasing, not increasing order"
 | backwards.
 |
 | I meant to contradict the top post: "the largest number is
 | at position 1...we can treat n+1 as a new array of length
 | N-n and solve using the base case proof", which would result
 | in decreasing order (largest first).
 | tromp wrote:
 | If the (i+1)'st outer loop starts with the first i elements
 | in sorted order, then it ends with the first i+1 elements in
 | sorted order.
 |
 | In fact, if k of the first i elements are <= A[i+1], then the
 | first k iterations of the inner loop will not swap, and the
 | next i-k iterations will swap. The latter can be seen to
 | maintain order.
 |
 | The last n-i iterations of the inner loop (where j >= i) do
 | not affect this reasoning and can be omitted, as can the
 | first iteration of the outer loop.
 | MattPalmer1086 wrote:
 | Interesting. It is a bit counter-intuitive at first, but not
 | too hard to see how it works.
 |
 | After the first main loop, the first item will be the biggest
 | item in the list.
 |
 | The inner loop, as it always starts from 1 again, will
 | gradually replace the bigger items with smaller items, and so
 | on.
 | JadeNB wrote:
 | > After the first main loop, the first item will be the
 | biggest item in the list.
 |
 | > The inner loop, as it always starts from 1 again will
 | gradually replace the bigger items with smaller items, and so
 | on.
 |
 | I think the "and so on" is the point. To put it more
 | precisely, I think that anyone (with a suitable familiarity
 | with the tools) can prove formally that "after the first main
 | loop, the first item will be the biggest item in the list";
 | but I think that it might take a bit more work to prove the
 | "and so on" in a way that satisfies a formal prover.
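tromp's trimming observation can be sanity-checked with a short Python sketch (not from the thread; the function name is mine): drop the first outer iteration and all inner iterations with j >= i, and what remains still sorts. It is insertion by repeated swapping.

```python
def trimmed_sort(a):
    # tromp's trimmed variant: skip the first outer iteration and all
    # inner iterations with j >= i. The remaining loop inserts each
    # element into the sorted prefix by repeated swapping.
    n = len(a)
    for i in range(1, n):
        for j in range(i):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

print(trimmed_sort([4, 3, 6, 9, 1]))  # prints: [1, 3, 4, 6, 9]
```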
 |
 | (As to the intuition--I can buy that the "and so on" part is
 | reasonably intuitive, but only at a level of intuition that
 | can also believe wrong things. For example, just always
 | replacing bigger items with smaller items doesn't guarantee
 | you'll get a sorted list!)
 | MattPalmer1086 wrote:
 | Sure, a formal proof would be a lot harder. I just didn't
 | find it quite as surprising that it works as the article
 | implied.
 | asrp wrote:
 | After each outer loop iteration, A[:i] (the array up to i)
 | is in ascending order with the max of A at A[i].
 |
 | This is true on the first iteration, since max(A) is
 | eventually swapped to A[1]. It is true in subsequent
 | iterations since, during the ith iteration, it inserts the
 | next element, initially at A[i], into A[:i-1] and shifts
 | everything up with swaps with A[i], so A[:i] is sorted, with
 | the max of A moved to A[i]. After that no more swaps happen
 | in the ith iteration since A[i] contains the max of A.
 | [deleted]
 | roenxi wrote:
 | If it looks "obviously wrong", the quick proof that it works
 | ... the j loop will move the largest unsorted element in the
 | array into a sorted position. Since the i loop executes the
 | j loop n times, the array must be sorted (since the n
 | largest elements will be in the correct order).
 |
 | _EDIT_
 |
 | ^W^W^W
 |
 | Nope, I'm wrong. I did a worked example on paper. I think
 | the devious thing here is the algorithm looks simple at a
 | lazy glance.
 |
 | To sort: 4-3-6-9-1, the pivot for the next round of the _i_
 | loop in []:
 |
 |     [4]-3-6-9-1
 |     9-[3]-4-6-1
 |     3-9-[4]-6-1
 |     3-4-9-[6]-1
 |     3-4-6-9-[1]
 |     1-3-4-6-9   & sorted
 |
 | I can see that it sorts everything to the left of a pivot,
 | then because it does that n times it comes to a sorted list.
 | A reasonable proof will be more complicated than I thought.
 | phkahler wrote:
 | The list on the left (index less than i) is always sorted.
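The worked example above can be reproduced mechanically. A short Python trace (my own sketch, not from the thread) that prints the array after each pass of the i loop:

```python
def trace(a):
    # The same double loop, printing the array after each outer pass
    # to reproduce the 4-3-6-9-1 worked example.
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
        print(f"after i={i + 1}: {a}")
    return a

trace([4, 3, 6, 9, 1])
# after i=1: [9, 3, 4, 6, 1]
# after i=2: [3, 9, 4, 6, 1]
# after i=3: [3, 4, 9, 6, 1]
# after i=4: [3, 4, 6, 9, 1]
# after i=5: [1, 3, 4, 6, 9]
```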
 The
 | ith element is inserted by the j loop, and the rest of the
 | list is shifted right by one element by repeated swapping
 | with the ith position. Nothing to the right changes because
 | the ith element is the max of the entire list, which seems
 | to be a red herring for analysis.
 | blueflow wrote:
 | This somewhat looks like bubble sort minus the early exit.
 | MrYellowP wrote:
 | It _is_ BubbleSort. At least that's how I learned it in the
 | 80s.
 | zajio1am wrote:
 | It is just a quirky insertion sort. In each outer cycle, it
 | inserts the value originally at the A[i] position into the
 | already sorted sequence A[1]..A[i-1], while using A[i] as a
 | temporary variable to shift the higher part of the sorted
 | sequence one position up.
 | jeffhuys wrote:
 | Small sort implementation in ES6:
 |
 |     const args = process.argv.slice(2).map(Number);
 |     for (let i = 0; i < args.length; i++)
 |       for (let j = 0; j < args.length; j++)
 |         if (args[i] < args[j])
 |           [args[i], args[j]] = [args[j], args[i]];
 |     console.log(args.join(' '));
 |
 | Didn't expect it to be actually that simple, but now that I
 | wrote it, it makes complete sense.
 |
 | To PROVE it is another thing. Good work.
 | [deleted]
 | eatonphil wrote:
 | This is a fantastic post! It's great to have a concrete
 | example of proving a non-trivial algorithm. And somehow this
 | Ada code feels more approachable to me than HOL4 and other
 | functional proof assistants I've looked at.
 |
 | While it's been on my list for a while, I'm more curious to
 | try out Ada now.
 | Jtsummers wrote:
 | It'll take you less than a week to get proficient enough in
 | it to make something useful, or at least interesting. It's a
 | very straightforward language, overall.
 |
 | https://learn.adacore.com/ - Pretty much the best free
 | source until you're actually ready to commit to the
 | language. The "Introduction to Ada" course took me maybe a
 | week of 1-2 hours a day reading and practicing to go
 | through.
 There's also a
 | SPARK course that takes a bit longer, but is also
 | interactive.
 |
 | The language reference manuals for 2012 and 202x (which
 | should become Ada 2022):
 |
 | http://www.ada-auth.org/standards/ada12.html
 |
 | http://www.ada-auth.org/standards/ada2x.html
 | touisteur wrote:
 | Also, don't hesitate to use Rosetta Code to find useful
 | little snippets. Not all perfect, but a good jump-start.
 | Jtsummers wrote:
 | Yep, I wrote a bunch of those RC examples too. It was a
 | useful exercise when I decided to learn Ada (and with other
 | languages, too).
 | touisteur wrote:
 | Two of the most interesting things that trickled down during
 | the design of SPARK 2014:
 |
 | 1. Contracts are written in the same language as the code to
 | be checked/proved. This made it important to add expressive
 | features to the language (for all / for some: quantifiers!
 | expression functions, if- and case-expressions, and more
 | recently the new delta-aggregate notation): these additions
 | make the language far more expressive without too much loss
 | of readability.
 |
 | 2. Executable contracts: most contracts can be checked at
 | runtime or proved. And the escape hatch (code present only
 | for proof) is 'ghost' code, which is also a nice addition.
 |
 | Lots of little nifty additions to the language, from just
 | 'thinking in contracts' or 'designing for provability'.
 | rationalfaith wrote:
___________________________________________________________________
(page generated 2022-07-04 23:00 UTC)