[HN Gopher] The FBHHRBNRSSSHK-Algorithm for Multiplication is st...
       ___________________________________________________________________
        
       The FBHHRBNRSSSHK-Algorithm for Multiplication is still not the end
        
       Author : zitterbewegung
       Score  : 96 points
       Date   : 2022-10-12 17:36 UTC (5 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | Sharlin wrote:
       | The cryptic "5x52" in the title should be
       | $\mathbb{Z}_2^{5\times5}$. Not sure how best to phrase that in
       | plain text, but at least it shouldn't be "5x52".
        
         | dang wrote:
         | Let's try to avoid this by cutting that bit.
         | 
         | Someone will soon point out why that doesn't work, but in the
         | meantime we may gain a few minutes...
        
         | raverbashing wrote:
         | Z_2^(5x5) might be a better description (or, better, 5x5
         | matrices of binaries)
        
           | Sharlin wrote:
           | Or "5x5 bit matrices".
        
             | [deleted]
        
         | mminer237 wrote:
         | Something like Z25x5?
        
           | jwilk wrote:
            | The multiplication sign should be in superscript, but
            | unfortunately there's no suitable Unicode character... unless
           | we abuse IPA:
           | 
           | Z25x5
        
             | cassonmars wrote:
             | Might be abuse of IPA, but 100% stealing this trick for
             | platforms that don't let me use LaTeX
        
       | alexmolas wrote:
       | I love how advances in modern approaches (deep learning) push for
       | advances in classical approaches. Something similar happened in
        | physics, where advances in quantum algorithms led researchers
        | to find better classical algorithms.
       | 
       | Science is not a competition!
        
         | robbomacrae wrote:
          | My admittedly less charitable take is that they may have been
          | sitting on this one-operation optimization for a while, noticed
          | it works on the new method, and only decided to go public now
          | that they have seen the positive reception and potential of
          | such collaborations.
        
           | kcexn wrote:
           | More likely, I think it's that the first principles approach
           | they have used to find this optimisation has applications in
           | domains far outside of optimising matrix multiplications.
           | 
           | Only because the Google paper crossed their desk did they
           | even think to try it on matrix multiplications.
           | 
           | And then more or less by luck, it happened to turn out to be
           | slightly better. If it had produced exactly the machine
           | learning algorithm, it would have been less interesting.
           | 
           | The speed of the response makes me feel that the authors are
           | extremely good at their domain, and that finding this
           | algorithm was a reasonably trivial problem for them. The
            | slowness of the follow-up paper, I believe, is because the
            | authors understand that their mathematical domain is wildly
            | far outside of what most people can easily understand, so
            | they're going to take their time to write it up well, instead
            | of dropping a whole bunch of maths definitions that nobody
            | else in the world understands.
        
       | metahost wrote:
        | Aside: what's up with the name of the algorithm? -- does it stand
       | for something?
        
       | jeffbee wrote:
       | This is a bit like the way that Bannister broke a decade-old
       | record with his 4-minute mile, then someone beat his record 6
       | weeks later.
        
         | harveywi wrote:
         | Even more impressive considering that the universe is
         | expanding. So everyone today has to run farther in even less
         | time.
        
           | tiborsaas wrote:
           | Matter is not expanding with space, the forces between atoms
           | are much stronger, so planets stay as-is over time. Instead
           | think of space between galaxies expanding.
        
             | ncmncm wrote:
             | I can't even think of the space between galaxies as they
             | are now.
        
             | yarg wrote:
             | I think he was joking.
        
       | telotortium wrote:
       | Abstract: "In response to a recent Nature article which announced
       | an algorithm for multiplying 5x5-matrices over Z2 with only 96
       | multiplications, two fewer than the previous record, we present
       | an algorithm that does the job with only 95 multiplications."
       | 
       | The "recent Nature article" is, of course, Deepmind's
       | reinforcement learning-guided search for better matrix
       | multiplication algorithms:
       | https://news.ycombinator.com/item?id=33096580.
       | 
       | They improved upon Deepmind's algorithms by applying a method of
       | their own, which is not described in this paper, but which they
       | will publish in an upcoming paper: "Manuel Kauers and Jakob
       | Moosbauer. Flip graphs for matrix multiplication. in
       | preparation."
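For context on the numbers in the abstract: a schoolbook 5x5 product uses 5^3 = 125 scalar multiplications, which is the baseline the 96- and 95-multiplication schemes improve on. A minimal sketch (plain Python; the function name and counting are my own illustration, not from the paper) that counts them over Z_2:

```python
import random

def gf2_matmul_naive(A, B):
    """Schoolbook n x n product over Z_2, counting scalar multiplications."""
    n = len(A)
    mults = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0
            for k in range(n):
                s ^= A[i][k] & B[k][j]  # AND is multiplication in Z_2
                mults += 1
            C[i][j] = s
    return C, mults

random.seed(0)
A = [[random.randint(0, 1) for _ in range(5)] for _ in range(5)]
B = [[random.randint(0, 1) for _ in range(5)] for _ in range(5)]
C, mults = gf2_matmul_naive(A, B)
print(mults)  # 125 = 5^3; the record discussed in the paper is 95
```

Note that Strassen-like schemes count only element multiplications; additions (XORs over Z_2) are considered cheap because, applied recursively, the multiplication count alone determines the asymptotic exponent.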
        
         | TremendousJudge wrote:
         | So it's a teaser paper?
        
           | 3j0hn wrote:
            | I wouldn't exactly call it a teaser. The formulas they
            | published stand on their own, and putting them out there in
            | the wake of all the press around the AlphaTensor paper is
            | just a good idea.
           | 
           | Of course, these formulas are not super interesting on their
           | own, and so the forthcoming paper on HOW they developed these
           | formulas is in some sense the "real" paper.
        
           | mabbo wrote:
           | I think the point is to publish before anyone else does. If
            | they wait until their full paper is ready, they might be
           | scooped.
        
             | Yajirobe wrote:
             | Are we calling dibs on research now? Is this a
             | kindergarten?
        
               | sidlls wrote:
               | Read some of the history around natural philosophy some
               | few hundred years ago. It's a real hoot, how petty and
               | ego-driven some of these people were. Leibniz's attempts
               | at taking credit for Newton's hard work come to mind.
               | (Tongue somewhat planted in cheek here.)
               | 
               | Research has always been about getting credit for first
               | discovering something, to some extent.
        
               | girvo wrote:
               | Science/academia has been for as long as I can remember.
        
             | ReaLNero wrote:
             | I'm not sure how to feel about this. If the purpose of
             | science research is to build the understanding we have,
             | teaser articles seem more like a way to put a name on a
             | creation without actually contributing to our understanding
             | of something.
             | 
             | I guess this is an artifact of how cutthroat academia is
             | and how vital it is to have credit for being first.
        
               | TheRealPomax wrote:
               | > If the purpose of science research is to build the
               | understanding we have.
               | 
               | It's not. It never was. It's always been "to build the
               | understanding that I or this small group I am part of
               | has, that gives us an edge for a while until we share it
               | with the world because it looks like someone else might
               | otherwise pretend to have discovered it".
               | 
               | The idea of a noble science, diligently working towards
               | the betterment of mankind, has always been a romantic
               | fantasy at best.
        
               | golem14 wrote:
               | "The Double Helix" by James Watson is still a
               | surprisingly accessible read that's very relevant here.
        
               | Beldin wrote:
               | > _I guess this is an artifact of how cutthroat academia
               | is and how vital it is to have credit for being first._
               | 
                | In fairness, academia has improved much in this regard
               | since the days of Newton. That man was such an arse...
               | even the Wikipedia article about his feud with Leibniz
               | glosses over... everything.
               | 
               | Basically, not only was he a donkey about things, he used
               | all sorts of underhanded tactics to win and reveled in
               | devastating his rival's good standing.
        
               | eynsham wrote:
               | So long as it doesn't slow or otherwise impede the
               | dissemination of the full paper I don't mind personally.
        
               | sweettea wrote:
                | But putting out priority-proving work to avoid being
                | scooped is a long-standing practice in academia. Consider
                | US patents; or, further back, I read here yesterday that
                | Galileo published anagrams of his results and Kepler took
                | great delight in reverse-engineering them, accidentally
                | finding true facts other than Galileo's intended
                | discovery.
        
               | zitterbewegung wrote:
                | If they were able to complete it in a week then it is a
                | big deal... especially given the worry that they would
                | be scooped.
        
         | ckemere wrote:
         | Deep-learning-hype-hater-quick-take: Based on that title (and a
         | glance at the paper), my guess is that rather than brute force
         | and megawatts of GPU power, they found a first-principles
         | solution.
        
           | inasio wrote:
           | On that Google Nature paper, I'm not sure if the 5x5 matrix
           | improvement is the more significant result, or that they
           | figured out a way to optimize directly to SIMD/AVX
           | instructions, resulting in faster execution even if not
           | necessarily fewer operations
        
             | ncmncm wrote:
             | The result is not interesting for ordinary numbers, where
             | the classical ("slower") method is faster. When each
             | element of the 5x5 matrix is itself a matrix, it could be
             | important.
             | 
             | So AVX ops for the method are not interesting.
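The block-recursion point above can be made concrete. In the sketch below (my own naming and layout, not from either paper), each entry of the 5x5 grid is itself a matrix, so every multiplication the scheme saves is a whole block product saved, not just one scalar AND:

```python
import random

def gf2_mul(A, B):
    """Schoolbook square-matrix product over Z_2."""
    n = len(A)
    return [[sum(A[i][k] & B[k][j] for k in range(n)) % 2
             for j in range(n)] for i in range(n)]

def gf2_add(A, B):
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M, s):
    """Cut an N x N matrix into an (N/s x N/s) grid of s x s blocks."""
    n = len(M) // s
    return [[[row[js*s:(js+1)*s] for row in M[bi*s:(bi+1)*s]]
             for js in range(n)] for bi in range(n)]

def block_mul(A, B, s):
    """Multiply via the block grid: each '*' here is itself an s x s
    matrix product, which is where saving multiplications pays off."""
    GA, GB = split(A, s), split(B, s)
    n = len(GA)
    zero = [[0] * s for _ in range(s)]
    GC = [[zero for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = zero
            for k in range(n):
                acc = gf2_add(acc, gf2_mul(GA[i][k], GB[k][j]))
            GC[i][j] = acc
    # Reassemble the block grid into a flat N x N matrix.
    return [sum((GC[i][j][r] for j in range(n)), [])
            for i in range(n) for r in range(s)]

random.seed(1)
N, s = 10, 2  # a 10x10 matrix viewed as a 5x5 grid of 2x2 blocks
A = [[random.randint(0, 1) for _ in range(N)] for _ in range(N)]
B = [[random.randint(0, 1) for _ in range(N)] for _ in range(N)]
assert block_mul(A, B, s) == gf2_mul(A, B)
print("block product matches")
```

This sketch still uses the schoolbook formula on the block grid; substituting a 95-multiplication 5x5 scheme for the inner triple loop is exactly the recursion that would make the saving compound.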
        
         | naillo wrote:
         | Hope they get a nature article as well.
        
       | alexmolas wrote:
        | Besides the reduction from 96 to 95 multiplications, it would
        | be interesting to know how they derived these results.
       | 
       | I think it's more interesting and has more impact to know the
       | methodology they followed than just to know the algorithm.
        
       | ajkjk wrote:
       | I'm really interested to know when we will be able to prove that
       | we have the fastest algorithm for some matrix multiplication
       | problem. Is there a theoretical lower bound? (well in this case,
       | for 5x5, there's gonna be an actual lower bound. but for
       | asymptotic cases..?) Is figuring out the theoretical lower bound
       | some kind of undecidable/NP problem? what do we know about it?
        
         | ithinkso wrote:
          | I think there is an educated belief that it can be done in
          | n^(2+epsilon) but not in n^2. No proof yet, of course.
        
         | genpfault wrote:
         | https://en.wikipedia.org/wiki/Computational_complexity_of_ma...
        
         | imbusy111 wrote:
          | Obviously a lower bound is Ω(N*M), but it's not very useful.
         | But I'm not aware of any tighter lower bound.
        
         | klyrs wrote:
          | A lower bound of Ω(n^2 log n) has been known for quite some
         | time. Good luck going any tighter than that...
         | https://eccc.weizmann.ac.il/report/2002/012/download/ [pdf]
        
           | 3j0hn wrote:
           | This is a little bit apples and oranges, however. An
           | asymptotic bound on the complexity doesn't say a lot about
           | the number of multiplications needed for a fixed value of n=5
           | say. A lot of those complexity proofs say "for large enough
           | n".
        
             | klyrs wrote:
             | Not so! The 5x5 multiplication can be applied recursively
             | to NxN matrices -- at the top level the "5x5
             | multiplication" carves the matrix into a 5x5 grid of
             | "blocks" each of size approximately (N/5 x N/5). It's been
             | a while since I've done this sort of complexity analysis,
             | but I found some nice slides here explaining how to use the
             | so-called Master Theorem for recursive algorithms: https://
             | www.inf.ed.ac.uk/teaching/courses/ads/Lects/lecture3...
             | [pdf]
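For a rough sense of the exponents this recursion yields (my own back-of-envelope calculation, not from the slides): an (m x m) scheme using t multiplications, applied recursively, gives an O(n^(log_m t)) algorithm by the Master Theorem.

```python
import math

# Master-theorem exponent for a recursive bilinear scheme that
# multiplies m x m matrices with t element multiplications: log_m(t).
strassen = math.log(7, 2)    # classic 2x2 scheme, 7 multiplications
alpha_96 = math.log(96, 5)   # AlphaTensor's 5x5 scheme over Z_2
new_95   = math.log(95, 5)   # the 95-multiplication scheme from TFA
print(f"{strassen:.3f} {alpha_96:.3f} {new_95:.3f}")
# -> 2.807 2.836 2.829
```

Both 5x5 exponents still trail Strassen's, which is consistent with the observation upthread that the new result matters over Z_2 (and for fixed-size block products) rather than as an asymptotic speedup for ordinary matrices.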
        
           | henrydark wrote:
           | that's for real and complex fields, and it's not exactly
            | that. I don't think there's anything better than Ω(n^2)
            | with no conditions.
        
             | klyrs wrote:
             | ah, excellent point
        
         | [deleted]
        
         | [deleted]
        
         | [deleted]
        
       | Traubenfuchs wrote:
        | How would one go about finding new ways to multiply matrices?
       | 
       | Trial and error? Can the process be automated? Genius gets hit on
       | the head by an apple moment?
        
         | whymauri wrote:
         | Virginia Williams teaches an excellent course:
         | 
         | https://people.csail.mit.edu/virgi/6.890/
         | 
         | Lecture notes are open.
        
           | PartiallyTyped wrote:
           | Thank you for sharing this! Do you know of similar gems?
        
         | Jabbles wrote:
         | See https://news.ycombinator.com/item?id=33096580
        
         | klyrs wrote:
         | For small n, you can reasonably brute-force all (n x n)
         | multiplication formulae in characteristic 2 because there's a
         | naive bound on the number of element multiplications and the
         | coefficients are binary. As n gets large (say, n>3?), the cost
         | of brute-force becomes prohibitive.
         | 
         | Beyond brute-force, I imagine that this problem could be
         | phrased as a max-sat problem, mixed-integer linear program or
         | similar. There are generic solvers that can get solutions for
         | problems of moderate size. But unless it's a fairly rote
         | translation to the solver's native representation, it's often
         | better to write a custom solver and port heuristics into the
         | notation native to the problem. As far as I understand it,
         | that's the approach that the Fawzi et al and TFA took.
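Whatever search strategy is used, the cheap part is checking a candidate scheme: over Z_2 the coefficients are bits, so a small formula can be verified exhaustively. As a toy stand-in for the 5x5 case (my own sketch, using Strassen's well-known 2x2 scheme rather than anything from the paper), here is such a check:

```python
def strassen_2x2_gf2(A, B):
    """Strassen's 7-multiplication 2x2 scheme, specialized to Z_2,
    where subtraction and addition both become XOR."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 ^ a22) & (b11 ^ b22)
    m2 = (a21 ^ a22) & b11
    m3 = a11 & (b12 ^ b22)
    m4 = a22 & (b21 ^ b11)
    m5 = (a11 ^ a12) & b22
    m6 = (a21 ^ a11) & (b11 ^ b12)
    m7 = (a12 ^ a22) & (b21 ^ b22)
    return ((m1 ^ m4 ^ m5 ^ m7, m3 ^ m5),
            (m2 ^ m4, m1 ^ m2 ^ m3 ^ m6))

def naive_2x2_gf2(A, B):
    return tuple(
        tuple((A[i][0] & B[0][j]) ^ (A[i][1] & B[1][j]) for j in range(2))
        for i in range(2)
    )

# Exhaustively verify the scheme on all 2^4 * 2^4 = 256 input pairs.
bit_mats = [((a, b), (c, d)) for a in (0, 1) for b in (0, 1)
            for c in (0, 1) for d in (0, 1)]
ok = all(strassen_2x2_gf2(A, B) == naive_2x2_gf2(A, B)
         for A in bit_mats for B in bit_mats)
print(ok)  # True
```

The hard part, of course, is the search over candidate formulas, not this verification; for 5x5 the space is far too large to enumerate, which is why guided approaches (RL in the Fawzi et al paper, the authors' flip-graph method in TFA) are needed.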
        
           | js8 wrote:
            | I was wondering, why use DeepMind when you can just use a
            | SAT solver? I am not saying SAT solvers cannot be improved
            | with AI; maybe they can.
        
             | fcholf wrote:
             | Actually, this has been done:
             | https://arxiv.org/abs/1903.11391
        
             | klyrs wrote:
             | Note my uncertainty about the suitability of a max-sat
             | formulation. But also note who did the Fawzi et al result:
             | the DeepMind team at Google. I think that dogfooding their
             | own stuff is an understandable default.
        
       ___________________________________________________________________
       (page generated 2022-10-12 23:00 UTC)