[HN Gopher] The FBHHRBNRSSSHK-Algorithm for Multiplication is st...
___________________________________________________________________

The FBHHRBNRSSSHK-Algorithm for Multiplication is still not the end

Author : zitterbewegung
Score  : 96 points
Date   : 2022-10-12 17:36 UTC (5 hours ago)

(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)

| Sharlin wrote:
| The cryptic "5x52" in the title should be
| $\mathbb{Z}_2^{5\times5}$. Not sure how best to phrase that in
| plain text, but at least it shouldn't be "5x52".
| dang wrote:
| Let's try to avoid this by cutting that bit.
|
| Someone will soon point out why that doesn't work, but in the
| meantime we may gain a few minutes...
| raverbashing wrote:
| Z_2^(5x5) might be a better description (or, better, 5x5
| matrices of binaries)
| Sharlin wrote:
| Or "5x5 bit matrices".
| [deleted]
| mminer237 wrote:
| Something like Z25x5?
| jwilk wrote:
| The multiplication sign should be in superscript, but
| unfortunately there's no suitable Unicode character... unless
| we abuse IPA:
|
| Z25x5
| cassonmars wrote:
| Might be abuse of IPA, but 100% stealing this trick for
| platforms that don't let me use LaTeX
| alexmolas wrote:
| I love how advances in modern approaches (deep learning) push
| for advances in classical approaches. Something similar
| happened in physics, where advances in quantum algorithms led
| researchers to find better classical algorithms.
|
| Science is not a competition!
| robbomacrae wrote:
| My admittedly less charitable take is that they may have been
| sitting on this 1-operation optimization for a while, noticed
| it works on the new method, and only decided to go public now
| that they have seen the positive reception and potential of
| such collaborations.
| kcexn wrote:
| More likely, I think it's that the first-principles approach
| they have used to find this optimisation has applications in
| domains far outside of optimising matrix multiplications.
|
| Only because the Google paper crossed their desk did they
| even think to try it on matrix multiplications.
|
| And then, more or less by luck, it happened to turn out to be
| slightly better. If it had produced exactly the machine
| learning algorithm, it would have been less interesting.
|
| The speed of the response makes me feel that the authors are
| extremely good in their domain, and that finding this
| algorithm was a reasonably trivial problem for them. The
| slowness of the follow-up paper, I believe, is because the
| authors understand that their mathematical domain is wildly
| far outside what most people can easily understand, so
| they're going to take their time to write it up well instead
| of dropping a whole bunch of maths definitions that nobody
| else in the world understands.
| metahost wrote:
| Aside: what's up with the name of the algorithm? -- does it
| stand for something?
| jeffbee wrote:
| This is a bit like the way that Bannister broke a decade-old
| record with his 4-minute mile, then someone beat his record 6
| weeks later.
| harveywi wrote:
| Even more impressive considering that the universe is
| expanding. So everyone today has to run farther in even less
| time.
| tiborsaas wrote:
| Matter is not expanding with space; the forces between atoms
| are much stronger, so planets stay as-is over time. Instead,
| think of the space between galaxies expanding.
| ncmncm wrote:
| I can't even think of the space between galaxies as they
| are now.
| yarg wrote:
| I think he was joking.
| telotortium wrote:
| Abstract: "In response to a recent Nature article which
| announced an algorithm for multiplying 5x5-matrices over Z2
| with only 96 multiplications, two fewer than the previous
| record, we present an algorithm that does the job with only
| 95 multiplications."
| | The "recent Nature article" is, of course, Deepmind's | reinforcement learning-guided search for better matrix | multiplication algorithms: | https://news.ycombinator.com/item?id=33096580. | | They improved upon Deepmind's algorithms by applying a method of | their own, which is not described in this paper, but which they | will publish in an upcoming paper: "Manuel Kauers and Jakob | Moosbauer. Flip graphs for matrix multiplication. in | preparation." | TremendousJudge wrote: | So it's a teaser paper? | 3j0hn wrote: | I wouldn't exactly call it a teaser. The formulas they | published stand on their own, and putting them out there in | the wake of all the press of the AlphaTensor paper, is just a | good idea. | | Of course, these formulas are not super interesting on their | own, and so the forthcoming paper on HOW they developed these | formulas is in some sense the "real" paper. | mabbo wrote: | I think the point is to publish before anyone else does. If | they wait until they're full paper is ready, they might be | scooped. | Yajirobe wrote: | Are we calling dibs on research now? Is this a | kindergarten? | sidlls wrote: | Read some of the history around natural philosophy some | few hundred years ago. It's a real hoot, how petty and | ego-driven some of these people were. Leibniz's attempts | at taking credit for Newton's hard work come to mind. | (Tongue somewhat planted in cheek here.) | | Research has always been about getting credit for first | discovering something, to some extent. | girvo wrote: | Science/academia has been for as long as I can remember. | ReaLNero wrote: | I'm not sure how to feel about this. If the purpose of | science research is to build the understanding we have, | teaser articles seem more like a way to put a name on a | creation without actually contributing to our understanding | of something. | | I guess this is an artifact of how cutthroat academia is | and how vital it is to have credit for being first. 
| TheRealPomax wrote:
| > If the purpose of science research is to build the
| understanding we have.
|
| It's not. It never was. It's always been "to build the
| understanding that I or this small group I am part of has,
| which gives us an edge for a while until we share it with
| the world because it looks like someone else might
| otherwise pretend to have discovered it".
|
| The idea of a noble science, diligently working towards
| the betterment of mankind, has always been a romantic
| fantasy at best.
| golem14 wrote:
| "The Double Helix" by James Watson is still a
| surprisingly accessible read that's very relevant here.
| Beldin wrote:
| > _I guess this is an artifact of how cutthroat academia
| is and how vital it is to have credit for being first._
|
| In fairness, academia has improved much in this regard
| since the days of Newton. That man was such an arse...
| even the Wikipedia article about his feud with Leibniz
| glosses over... everything.
|
| Basically, not only was he a donkey about things, he used
| all sorts of underhanded tactics to win and reveled in
| devastating his rival's good standing.
| eynsham wrote:
| So long as it doesn't slow or otherwise impede the
| dissemination of the full paper, I don't mind personally.
| sweettea wrote:
| But it is a long-standing practice in academia to avoid
| being scooped by putting out priority-proving work.
| Consider US patents; or, further back, I read here
| yesterday that Galileo published anagrams of his results
| and Kepler took great delight in reverse-engineering them,
| accidentally finding true facts other than Galileo's
| intended discovery.
| zitterbewegung wrote:
| If they were able to complete it in a week then it is a
| big deal... especially if they were worried they would be
| scooped.
| ckemere wrote:
| Deep-learning-hype-hater quick take: based on that title (and
| a glance at the paper), my guess is that rather than brute
| force and megawatts of GPU power, they found a
| first-principles solution.
| inasio wrote:
| On that Google Nature paper, I'm not sure if the 5x5 matrix
| improvement is the more significant result, or that they
| figured out a way to optimize directly to SIMD/AVX
| instructions, resulting in faster execution even if not
| necessarily fewer operations
| ncmncm wrote:
| The result is not interesting for ordinary numbers, where
| the classical ("slower") method is faster. When each
| element of the 5x5 matrix is itself a matrix, it could be
| important.
|
| So AVX ops for the method are not interesting.
| naillo wrote:
| Hope they get a Nature article as well.
| alexmolas wrote:
| Besides the reduction from 96 to 95 multiplications, it would
| be interesting to know how they derived these results.
|
| I think it's more interesting and has more impact to know the
| methodology they followed than just to know the algorithm.
| ajkjk wrote:
| I'm really interested to know when we will be able to prove
| that we have the fastest algorithm for some matrix
| multiplication problem. Is there a theoretical lower bound?
| (Well, in this case, for 5x5, there's gonna be an actual
| lower bound. But for asymptotic cases...?) Is figuring out
| the theoretical lower bound some kind of undecidable/NP
| problem? What do we know about it?
| ithinkso wrote:
| I think there is some educated belief that it can be done in
| n^(2+epsilon) but not in n^2. No proof yet, of course.
| genpfault wrote:
| https://en.wikipedia.org/wiki/Computational_complexity_of_ma...
| imbusy111 wrote:
| Obviously a lower bound is Omega(N*M), but it's not very
| useful. But I'm not aware of any tighter lower bound.
| klyrs wrote:
| A lower bound of Omega(n^2 log n) has been known for quite
| some time. Good luck going any tighter than that...
| https://eccc.weizmann.ac.il/report/2002/012/download/ [pdf]
| 3j0hn wrote:
| This is a little bit apples and oranges, however. An
| asymptotic bound on the complexity doesn't say a lot about
| the number of multiplications needed for a fixed value of,
| say, n=5. A lot of those complexity proofs say "for large
| enough n".
| klyrs wrote:
| Not so! The 5x5 multiplication can be applied recursively
| to NxN matrices -- at the top level, the "5x5
| multiplication" carves the matrix into a 5x5 grid of
| "blocks", each of size approximately (N/5 x N/5). It's
| been a while since I've done this sort of complexity
| analysis, but I found some nice slides here explaining how
| to use the so-called Master Theorem for recursive
| algorithms:
| https://www.inf.ed.ac.uk/teaching/courses/ads/Lects/lecture3...
| [pdf]
| henrydark wrote:
| That's for real and complex fields, and it's not exactly
| that. I don't think there's anything better than O(n^2)
| with no conditions.
| klyrs wrote:
| ah, excellent point
| [deleted]
| [deleted]
| [deleted]
| Traubenfuchs wrote:
| How would one go about finding new ways to multiply
| matrices?
|
| Trial and error? Can the process be automated? A genius
| gets-hit-on-the-head-by-an-apple moment?
| whymauri wrote:
| Virginia Williams teaches an excellent course:
|
| https://people.csail.mit.edu/virgi/6.890/
|
| Lecture notes are open.
| PartiallyTyped wrote:
| Thank you for sharing this! Do you know of similar gems?
| Jabbles wrote:
| See https://news.ycombinator.com/item?id=33096580
| klyrs wrote:
| For small n, you can reasonably brute-force all (n x n)
| multiplication formulae in characteristic 2, because there's
| a naive bound on the number of element multiplications and
| the coefficients are binary. As n gets large (say, n>3?),
| the cost of brute force becomes prohibitive.
|
| Beyond brute force, I imagine that this problem could be
| phrased as a max-sat problem, a mixed-integer linear
| program, or similar.
| There are generic solvers that can get solutions for
| problems of moderate size. But unless it's a fairly rote
| translation to the solver's native representation, it's often
| better to write a custom solver and port heuristics into the
| notation native to the problem. As far as I understand it,
| that's the approach that both Fawzi et al. and TFA took.
| js8 wrote:
| I was wondering: why use DeepMind when you can just use a
| SAT solver? I am not saying SAT solvers cannot be improved
| with AI; maybe they can.
| fcholf wrote:
| Actually, this has been done:
| https://arxiv.org/abs/1903.11391
| klyrs wrote:
| Note my uncertainty about the suitability of a max-sat
| formulation. But also note who did the Fawzi et al. result:
| the DeepMind team at Google. I think that dogfooding their
| own stuff is an understandable default.
___________________________________________________________________
(page generated 2022-10-12 23:00 UTC)
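To make the Master Theorem point from the thread concrete: applying a 5x5 scheme with m multiplications recursively to a 5x5 grid of blocks gives the recurrence T(N) = m*T(N/5) + O(N^2), hence a running time of O(N^(log_5 m)). A quick sketch of the exponents involved (an editorial illustration, not part of the paper):

```python
import math

def mm_exponent(block_size, mults):
    """Asymptotic exponent of recursive block multiplication:
    T(N) = mults * T(N / block_size) + O(N^2), which by the
    Master Theorem gives O(N ** log_block_size(mults)) whenever
    that exponent exceeds 2."""
    return math.log(mults) / math.log(block_size)

print(f"naive 5x5 (125 mults):  {mm_exponent(5, 125):.3f}")  # 3.000
print(f"AlphaTensor Z_2 (96):   {mm_exponent(5, 96):.3f}")   # ~2.84
print(f"this paper, Z_2 (95):   {mm_exponent(5, 95):.3f}")   # ~2.83
print(f"Strassen 2x2 (7 mults): {mm_exponent(2, 7):.3f}")    # ~2.81
```

So the 96 -> 95 improvement shaves the recursive exponent only slightly, and (as noted in the thread) Strassen's 2x2 scheme still wins asymptotically; the interest is in the fixed-size record and the method behind it.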