[HN Gopher] Discovering faster matrix multiplication algorithms ...
___________________________________________________________________
Discovering faster matrix multiplication algorithms with
reinforcement learning
Author : shantanu_sharma
Score : 331 points
Date : 2022-10-05 15:12 UTC (7 hours ago)
(HTM) web link (www.nature.com)
(TXT) w3m dump (www.nature.com)
| [deleted]
| didibus wrote:
| Can someone knowledgeable tell me if it discovered an algorithm
| we can understand and re-implement ourselves, like is the pseudo-
| code for it known? Or is it kind of stuck in the inferred function
| of the ML model?
| bugzz wrote:
| yes, they implemented the resulting algorithms and have charts
| showing real world performance on GPUs and TPUs
| ckrapu wrote:
| 50% fewer operations for skew-symmetric matrix-vector product is
| pretty big, IMO.
| xpe wrote:
| The paper is a great read -- an interesting approach, fun
| theoretical comparisons, plus benchmarks for optimized algorithms
| for specific hardware (CPU and GPU, allowing for differing
| instruction sets).
|
| But it doesn't stop there; from the "Discussion" section:
|
| > One important strength of AlphaTensor is its flexibility to
| support complex stochastic and non-differentiable rewards (from
| the tensor rank to practical efficiency on specific hardware), in
| addition to finding algorithms for custom operations in a wide
| variety of spaces (such as finite fields). We believe this will
| spur applications of AlphaTensor towards designing algorithms
| that optimize metrics that we did not consider here, such as
| numerical stability or energy usage.
| machinekob wrote:
| Wasn't this based on some other paper? I have read something like
| that a year+ (?) ago, also about a deep-learning-based matrix
| multiplication boost.
| KKKKkkkk1 wrote:
| Reinforcement learning is usually introduced as a technique to
| train an agent in simulation that would then be let loose in a
| real environment. Interesting that they choose to treat it as a
| search strategy.
| mikewarot wrote:
| I have a gut feeling that there is a faster way to compute
| logarithms going from least to most significant bit. How would I
| go about using ML to find it?
|
| [Edit] I think Feynman's algorithm might do it:
|
| "Consider the problem of finding the logarithm of a fractional
| number between 1 and 2. (The algorithm can be generalized without
| too much difficulty.) Feynman observed that any such number can
| be uniquely represented as a product of numbers of the form 1 +
| 2^(-k), where k is an integer. Testing for the presence of each
| of these factors in a binary representation is simply a matter of
| a shift and a subtraction. Once the factors are determined, the
| logarithm can be computed by adding together the precomputed
| logarithms of the factors. The algorithm fit the Connection
| Machine especially well because the small table of the logarithms
| of 1 + 2^(-k) could be shared by all the processors. The entire
| computation took less time than doing a division."
| radford-neal wrote:
| Seems similar to the CORDIC algorithm
| (https://en.wikipedia.org/wiki/CORDIC).
| adgjlsfhk1 wrote:
| note that this algorithm will no longer be good. for precisions
| up to roughly 1000 bits, small tables combined with minimax
| polynomials are optimal, and for higher precision you want to
| use more complicated methods. if you're interested, the ARB
| library likely has the currently fastest known methods.
| xpe wrote:
| Start with these questions:
|
| 1. How big is the search space?
|
| 2. What analysis approaches are likely to bear fruit for the
| search space? (theoretical analysis? optimization?)
|
| 3. If optimization is called for, what kind?
| svnt wrote:
| The big claim turns out to be a little overstated. The claim:
|
| > AlphaTensor's algorithm improves on Strassen's two-level
| algorithm for the first time, to our knowledge, since its
| discovery 50 years ago.
|
| reduces to:
|
| > AlphaTensor discovers algorithms that outperform the Strassen-
| square algorithm, which is a fast algorithm for large square
| matrices [31,32]. Although the discovered algorithm has the same
| theoretical complexity as Strassen-square, it outperforms it in
| practice, as it is optimized for the considered hardware.
| Interestingly, AlphaTensor finds algorithms with a larger number
| of additions compared with Strassen-square (or equivalently,
| denser decompositions), but the discovered algorithms generate
| individual operations that can be efficiently fused by the
| specific XLA [33] grouping procedure and thus are more tailored
| towards the compiler stack we use. The algorithms found by
| AlphaTensor also provide gains on matrix sizes larger than what
| they were optimized for. Finally, Fig. 5c shows the importance of
| tailoring to particular hardware, as algorithms optimized for one
| hardware do not perform as well on other hardware.
|
| So they improved on Strassen's 50-year-old algorithm by
| optimizing it for hardware that has only existed for a few years
| and de-optimizing it for human use? There is so much cool stuff
| here without claiming to do something you really didn't.
| atty wrote:
| On the other hand, real world performance is really the only
| useful metric for matrix multiplication. I don't really care
| about the theoretical performance, or the number of operations.
| Not disagreeing with your take that the claim is grandiose,
| just pointing out that finding a generalizable way to automate
| the improvement of what is almost certainly the most important
| mathematical operation a computer can do is worth some
| attention. It also suggests other mathematical operations could
| be improved this way as well - potentially algorithms that
| haven't gotten as much theoretical or practical attention as
| matrix multiplication. With all that said, they even point out
| that other people have worked on this before using other
| optimization strategies. I'd guess they got into Nature either
| by using the Deepmind name, because Nature really loves
| reinforcement learning because it's a cool topic that draws
| views, or because they're the first group to find their
| algorithm leads to real improvements. (Probably a mixture of
| all three, I'd guess)
| throwawaymaths wrote:
| > the number of operations
|
| This is a very good proxy for actual real world speed. It's
| pretty much "as good as it gets" for most straight
| computational tasks, though sometimes memory movement is your
| real bottleneck.
| fulafel wrote:
| It's become a much worse proxy since the memory wall.
| sdenton4 wrote:
| This is questionable...
|
| The 'best' algorithms for matrix multiplication are
| galactic algorithms that provide no actual benefit. Raw
| operation counts are a good proxy for speed, but the big-O
| complexity that people actually chase hasn't been
| especially helpful for this problem in the last twenty+
| years.
|
| https://en.wikipedia.org/wiki/Matrix_multiplication_algorith...
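For reference, here is the one-level Strassen block step the thread keeps
referring to, written out as a minimal NumPy sketch. This is an illustration
only, not code from the paper; the function and variable names are mine, and
it assumes square matrices of even size. The point is that the 2x2 block
decomposition needs 7 block products instead of the naive 8, and that
multiplication count (the tensor rank), rather than the number of additions,
is the quantity the search in the paper tries to reduce.

    # Minimal one-level Strassen step: 7 block products instead of 8.
    # Illustrative sketch only; assumes square matrices with even size.
    import numpy as np

    def strassen_once(A, B):
        n = A.shape[0] // 2
        A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
        B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]

        # The 7 block products; a full implementation would recurse on these.
        M1 = (A11 + A22) @ (B11 + B22)
        M2 = (A21 + A22) @ B11
        M3 = A11 @ (B12 - B22)
        M4 = A22 @ (B21 - B11)
        M5 = (A11 + A12) @ B22
        M6 = (A21 - A11) @ (B11 + B12)
        M7 = (A12 - A22) @ (B21 + B22)

        # Reassemble the result using only additions and subtractions.
        return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                         [M2 + M4, M1 - M2 + M3 + M6]])

    A, B = np.random.rand(8, 8), np.random.rand(8, 8)
    assert np.allclose(strassen_once(A, B), A @ B)

Recursing on the seven products instead of calling @ directly is what turns
the one saved multiplication into the ~O(n^2.807) exponent discussed below.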
| throwawaymaths wrote:
| Probably the most common matrix multiplication is (nx9) x
| (9xm) (9 = 3x3 cells from a convnet). If you can
| optimize the shit out of those you might be in business
| for something interesting.
|
| Though to be honest the real slow step in machine
| learning is training and the slow step in training is the
| _outer product_ of two matrices.... I don't believe
| there is an algorithmic way out of that one.
|
| For non-ml/non-GF purposes, you might also worry about
| numerical stability of these matrix multiplications,
| which is not addressed in this paper.
| a1369209993 wrote:
| > and the slow step in training is the _outer product_ of
| two matrices.... I don't believe there is an algorithmic
| way out of that one.
|
| Well, there are several, but the obvious ones tend to
| require strange or unrealistic assumptions about the
| hardware. The most obvious such assumption, IMO, being
| that the hardware is arranged in 3D space in a manner
| roughly analogous to a human brain, which tends to be at
| odds with the common practice of mostly-planar
| photolithography, and with the preference to be able to
| change the network topology experimentally without
| building new hardware.
| dekhn wrote:
| Science proceeds at the rate by which linearly larger
| matrices can be decomposed in less than exponential time.
| matheusd wrote:
| This is excellent! You should make a T-Shirt with this
| quote on it!
| psychphysic wrote:
| Surely theoretical improvement begets real-world improvement,
| especially in the context of highly specialised hardware.
|
| It was theoretical performance improvement that motivated
| the creation of SIMD and led to real-world speed-ups.
| fulafel wrote:
| It's interesting how both SIMD and big-O have had such
| different reasons for being of limited but still
| significant relevance.
|
| SIMD, or vector processors as they were called in the 70s,
| delivered practical speedups in simple benchmarks right
| away, but most applications don't take advantage because of
| SW engineering reasons. Whereas big-O improvement ignores
| important components of performance per unit of time
| (memory access and constant factors) and is purely
| theoretical in an essential sense.
| taeric wrote:
| Not necessarily. In particular, this could have slower
| growth, but still higher cost on normal workloads.
| chrchang523 wrote:
| Sometimes. In the case of matrix multiplication, there is a
| pretty large backlog of "galactic algorithms" going down to
| ~O(n^2.373) that haven't yet led to real-world
| improvements. Strassen's ~O(n^2.807) algorithm is only
| considered over the basic O(n^3) strategy for n>1000.
| HarHarVeryFunny wrote:
| Winograd O(n^2.37) is a win for 3x3 convolutions in
| cuDNN, so it can be implemented efficiently.
| chrchang523 wrote:
| My understanding is that the Winograd minimal filtering
| algorithms used in cuDNN are different from the O(n^2.37)
| Coppersmith-Winograd-descended matrix multiplication
| algorithms. But I acknowledge that these can be
| considered cousins, produced by the same line of
| research.
| HarHarVeryFunny wrote:
| I'm pretty sure there's no difference. It does seem to be
| pretty hard to turn the theoretical win into a practical
| one - the GPU kernel needs to be coded extremely
| efficiently to match the underlying hardware. AFAIK it's
| only a win for 3x3 - maybe for one other size too.
| Originally Winograd wasn't supported by cuDNN on NVidia's
| Tensor Cores (matmul-specific hardware on more recentish
| GPUs), vs CUDA cores, but a Google search seems to
| indicate it can be done - not sure if that's in cuDNN
| though.
| letitgo12345 wrote:
| I think they find an N^2.778 complexity algorithm (obtained by
| applying their 4x4 discovered algorithm recursively) so this is
| not correct.
| Ar-Curunir wrote:
| You're wrong. They discovered an algorithm with fewer
| multiplications for 4x4 matrices.
|
| They also discovered better algorithms for other dimensions.
| dekhn wrote:
| That's the DeepMind modus operandi - and why they publish in
| Nature. It's in their best interest to claim as much covered
| ground as possible so they can keep working towards their end-
| goal. But most people who read Nature already know that results
| in that journal are overhyped. You just have to apply a
| specific prior/translator when you read their papers.
| noobermin wrote:
| This really is a depressingly true sentiment. Nature is
| the most important publication, but so often the big shots
| publish overhyped stuff there that doesn't deserve it and get
| away with it since they are big shots. Underscore this as
| another piece of evidence of the dysfunction in science.
| brutusborn wrote:
| I don't doubt the sentiment is true, but hasn't this kind
| of science always been this way? By that I mean authors
| inflating the importance of their work; everyone wants to
| be seen as having the biggest breakthroughs.
|
| When I think of dysfunction in 'science' I usually think of
| unfalsifiable hypotheses, the repeatability crisis in
| Psychology, p-hacking in Medicine, misuse of statistical
| methods in Economics and other epistemic issues, but I
| don't think of exaggerations like this.
| dekhn wrote:
| Scientists have long been self-promoters who desire that
| their theories become the dominant ones, and they use many
| techniques to achieve this.
|
| However, the trend towards maximizing the predicted
| outcomes of your research really took off during the
| human genomics project.
| adalacelove wrote:
| > Scientists have long been self-promoters
|
| The famous ones more than the rest, I guess. To my mind
| nevertheless comes Cavendish, my hero.
| YorkshireSeason wrote:
| I would imagine that it's the metrics universities and
| funding agencies apply in promotion decisions. For
| example, my (well-known) university decided to measure
| impact, and takes "Twitter engagement" as one proxy
| metric for impact -- against my explicit recommendations.
| I'll leave the consequences to everybody's imagination.
| dekhn wrote:
| Yes. In fact many scientists post to twitter but don't
| look at the replies (Emily Bender is an example) or even
| block you if you disagree with them. That's not
| engagement, and I wonder what the dean would do about a
| promotion where the scientist just sort of blathered on
| twitter and had lots of followers, but wasn't actually
| providing any real scientific value (again, Emily Bender
| is an example).
| YorkshireSeason wrote:
| Deans typically lack detailed technical knowledge to
| evaluate candidates (not to mention time), they just go
| with the flow, and want to run the department smoothly.
|
| I think the real mechanism is this: those who make the
| decision (those with the most power, i.e.
| those who bring in the most funding and can threaten to
| leave if they don't get their way) already know whom they
| want, and they cherry-pick data, e.g. "impact" figures, to
| bolster their case. That enables the dean then to justify
| the decision in public with those cherry-picked figures ...
| (the dean can hardly say we are hiring X because otherwise
| top funding getter Y will leave)
|
| I think Twitter is less important in STEM subjects than
| in social sciences or humanities, as STEM has more
| clearcut results.
| chabons wrote:
| Maybe I'm mis-reading the paper, but my interpretation was that
| the algorithm discovered using a reward which scales with HW
| performance matches the theoretical complexity of Strassen, but
| is more performant on the HW.
|
| They _also_ identified algorithms which use fewer
| multiplications than Strassen, improving the best known
| number of multiplications required (They highlight this in
| Fig. 3).
|
| In that light, I thought their claim was fair. They've
| discovered (different) algorithms which are both theoretically
| and practically better than Strassen's.
| [deleted]
| boole1854 wrote:
| This is not a correct summary of the results of the paper.
|
| First, you cut out the initial part of the sentence about
| improving on Strassen's two-level algorithm. Here is the
| complete sentence:
|
| > Particularly relevant is the case of 4 x 4 matrices in a
| finite field, where AlphaTensor's algorithm improves on
| Strassen's two-level algorithm for the first time, to our
| knowledge, since its discovery 50 years ago.
|
| That is, note that the improvement cited in the latter part of
| the sentence is for the _4 x 4 matrix case_.
|
| But then your next quote is not in reference to 4 x 4 matrices.
| That is, for _larger matrix sizes_, the best algorithm
| AlphaTensor discovered had the same theoretical complexity as
| Strassen-square, but better performance.
|
| _EDIT: my next comments are confused and best ignored. See
| reply from pxx below._
|
| For the 4 x 4 matrix case, the theoretical complexity of
| AlphaTensor's algorithm was O(N^2.778) compared with the
| Strassen algorithm's complexity of O(N^2.8074), which _is_ an
| improvement.
| Nihmie wrote:
| To be pedantic about this complexity estimate for the 4 x 4
| case, both of them reduce to O(1). If N == 4, it's a
| constant, so we have O(4^2.778) == O(4^2.8074) == O(1).
|
| Talking about _scaling_ for a problem that has no scaling
| factor is a bit odd.
| vlovich123 wrote:
| If you have to process many such matrices, you're not at
| O(1). I would imagine that's what the O is referring to in
| this case.
| cochne wrote:
| If you look at it that way then any algorithm is just
| O(n) where n is the number of matrices. O does not care
| about constant factors.
| est31 wrote:
| Yeah the paper uses O notation for complexity, but never
| when N is constant. For constant N and M, the paper uses
| the number of scalar multiplication operations performed as
| the complexity measure. According to Figure 3, their
| algorithm has decreased the best known value from 49 to 47
| for 4x4 matrices, and from 98 to 96 for 5x5 matrices (as
| well as some further state of the art improvements).
| dgacmu wrote:
| In Z_2 - modular matrix multiplication, not conventional
| matrix multiplication. And in this case, modular over what I
| assume is GF(2), i.e., single bits.
| aaaaaaaaaaab wrote:
| Modular multiplication over GF(2)... also known as AND?
| pxx wrote:
| what, no, you have this all wrong.
| the complexity of multiplying two 4x4 matrices is clearly
| constant. the important thing about the constant, however,
| is that you can use this primitive on the block matrices
| recursively to possibly improve the asymptotic complexity.
|
| the complexity improvement in this paper is for arithmetic in
| Z_2 (modular arithmetic over single bits). in standard
| arithmetic, there is no asymptotic complexity improvement but
| the developed algorithms involve more efficiently fusible[0]
| operations, which is useful for the target hardware.
|
| [0] ed: this used to read 'with fewer multiplications,' which
| is clearly wrong: fewer multiplications would lower
| complexity. Interestingly, the discovered algorithm seems to
| have a larger number of additions, but still runs faster.
| ColinWright wrote:
| I thought that savings on the number of multiplications
| lead to savings in complexity. For example, multiplying
| 2x2 matrices using 7 multiplications instead of 8 gives an
| improvement in complexity.
|
| See my comment here:
| https://news.ycombinator.com/item?id=33098192
|
| It's possible that my understanding is completely wrong,
| but your comment is at odds with my other reading, so it
| would be useful to get some clarification.
| pxx wrote:
| sorry; whoops, I don't know why I said multiplications
| there. I somehow thought the "more additions" somehow
| meant "fewer multiplications," but no, it's just "more
| additions but faster performance on target." edited.
|
| your other comment notes that for some shapes, even in
| standard arithmetic, they have found algorithms with
| fewer multiplications than best-known. but those don't
| seem to extend to an algorithm with better asymptotic
| complexity for multiplying general matrices. otherwise
| I'd assume they'd claim them :)
| sdenton4 wrote:
| This is correct, as I understand it.
|
| Most complexity measures are defined in terms of
| asymptotic behavior, so any specific finite algorithm has
| 'constant' complexity. This is an obviously unhelpful bit
| of pedantry on the theory side.
|
| In fact, it's as you say; we can compute a 'raw'
| complexity for a finite operation (e.g., raw count of
| arithmetic ops), and then use that operation as a
| primitive to create an algorithm with an asymptotic
| complexity which depends directly on the finite
| operation's 'raw' complexity.
|
| (complexity theory, and specifically for matrix
| multiplication, is a great example of metrics becoming
| targets becoming bad incentives, a la Goodhart's Law. The
| so-called best algorithms are 'galactic' and therefore of
| no use to anyone. There's some fuzzy-headed hope that one
| day the people chasing the big-O exponent metric will
| come up with something practically useful, buuuuuut...
| you actually gotta work the practical problem to solve the
| practical problem, and that requires different metrics.
| This seems to be what's motivating the work under
| discussion here.)
| boole1854 wrote:
| You are correct. I will edit.
| a-dub wrote:
| if you're in gf2 on a traditional cpu and care about wall
| clock time, i start to wonder how something like simd xors
| and parity would compare...
| skybrian wrote:
| Isn't multiplying two 4x4 matrices a finite problem? What
| does big O complexity mean in that case?
| Rioghasarig wrote:
| My understanding of this situation is that they found a way
| to multiply 4x4 matrices that requires fewer
| multiplications than Strassen's.
| This implies that it can generalize to a way to multiply
| n x n matrices with better theoretical complexity than
| Strassen's algorithm.
| adgjlsfhk1 wrote:
| any fixed size multiplication algorithm has a corresponding
| general algorithm where you recursively divide a matrix
| into blocks and treat each block as an element. for any nxn
| algorithm that takes k multiplications, this yields a
| general algorithm with runtime N^(log_n k) for NxN matrices.
| ColinWright wrote:
| How can I reconcile your comment here with the comment in
| https://news.ycombinator.com/item?id=33098192 ??
| throwawaymaths wrote:
| No, they count multiplications, 47 vs 49 in the 4x4 case. This
| assumes that multiplications are the heavy lift relative to
| additions, memory moves, etc. This is not a bad assumption, and
| is hardware independent, although there are some proposed
| 'exotic' hardware solutions where addition and multiplication
| are equally difficult (via lookup table, limited to about 8-bit
| x 8-bit) or where addition is harder than multiplication (log
| space number systems), but I don't think these have been
| deployed.
| lifthrasiir wrote:
| That assumption didn't age very well. In many if not most
| modern architectures multiplication still does take more
| cycles than addition but can have a higher throughput if
| scheduled well, and fused multiply-add can be as fast as a
| single multiplication, essentially giving a free addition.
| throwawaymaths wrote:
| > but can have a higher throughput if scheduled well
|
| Unlikely to be true for matrix multiplications, which have
| well-defined data dependencies.
|
| > and fused multiply-add can be as fast as a single
| multiplication, essentially giving a free addition.
|
| Yes, this supports the assumption that multiplication is
| the heavy lift.
| tlb wrote:
| With sufficiently parallel hardware, you can do an entire
| mat4 * mat4 multiplication in 3 cycles. First do all 64
| multiplications in parallel, then do 32 adds, then 16
| adds to get the final answer.
|
| For operations in GF(2) where they claim a result, a
| multiply is just an AND gate and an add is an XOR gate.
| So the fully parallel hardware version is 64 AND gates
| and 48 XOR gates, with a total gate delay of 3. This is a
| trivial amount of hardware and could easily be an
| instruction in some alternate universe where it was
| useful.
| klyrs wrote:
| You missed the part where the "4x4 matrix multiplication"
| refers to multiplication of NxN matrices logically
| partitioned into sixteen blocks of size (N/4)x(N/4), done
| recursively.
| WhitneyLand wrote:
| In general, to speed up matrix operations, how much is some sort
| of result caching applied?
|
| For example, if you profile calculations done when training a
| model, is there any significant repetition happening that would
| allow some kind of benefit from table lookups of certain
| solutions?
| bitwize wrote:
| It's begun. AI is improving itself.
| chillee wrote:
| Quoting myself on twitter:
|
| https://twitter.com/cHHillee/status/1577713102434361344
|
| I'm quite suspicious about their hardware benchmarks. They're not
| writing custom kernels, they're relying on a graph compiler like
| XLA to automatically fuse their decomposed matmuls (and my guess
| is that XLA will not be very good at this).
|
| Moreover, as far as I can tell, they don't report absolute
| performance numbers anywhere. In other words, I suspect that a
| naive N^3 matrix multiplication would absolutely smoke them in
| performance.
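To put numbers on the recursion argument a few comments up: if an n x n base
algorithm uses k scalar multiplications, applying it recursively to blocks
gives roughly an O(N^(log_n k)) algorithm for N x N matrices. The quick check
below is my own sanity check, not taken from the paper; it just recomputes
the exponents quoted in the thread, with the caveat that the 47-multiplication
4x4 result holds in Z_2 rather than in standard arithmetic.

    # Exponents implied by the multiplication counts discussed above
    # (a quick sanity check, not taken from the paper).
    from math import log

    counts = {
        "naive 4x4 (64 mults)": (64, 4),               # exponent 3.0
        "Strassen 2x2 (7 mults)": (7, 2),              # ~2.807
        "Strassen-square 4x4 (49 mults)": (49, 4),     # ~2.807, same exponent
        "AlphaTensor 4x4 in Z_2 (47 mults)": (47, 4),  # ~2.777
    }

    for name, (k, n) in counts.items():
        print(f"{name}: ~ N^{log(k) / log(n):.3f}")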
| [deleted]
| puffoflogic wrote:
| Why the heck is a math paper in Nature? I'd put Nature somewhere
| just below vixra in terms of math credibility.
| 323 wrote:
| If you could find a way to train cats to solve integrals,
| wouldn't that be a Nature paper and not a math one?
| optimalsolver wrote:
| Now, were these algorithms discovered, or invented?
|
| I.e., have they always been there, just sitting in Platonic space
| waiting for a conscious mind to stumble across them, or have they
| just now popped into existence?
| WFHRenaissance wrote:
| Discovered. I'd go as far as to say that all "invention" is
| actually just discovery that we find useful.
| bjourne wrote:
| Never heard of the "matrix multiplication tensor" before. Is
| there some digestible blog post or article somewhere explaining
| what it is and why it is useful?
| HarHarVeryFunny wrote:
| Can someone explain the approach here?
|
| I understand that they transform the problem into a gamified 3-D
| matrix decomposition, but what exactly is the motivation for
| using RL to beat this decomposition game? Why not just use, for
| example, an evolutionary algorithm to find incrementally better
| decompositions?
| a-dub wrote:
| this is cool. i suppose it's only a matter of time until we see
| optimizing compilers that use transformer-rl style searches for
| subsets of their optimization and codegen.
| mhh__ wrote:
| Generating pgo data with graph ML is already a thing
| a-dub wrote:
| i'm guessing most of that stuff is register allocation and
| cache management.
|
| rl-searches could potentially rewrite entire algorithms.
|
| edit: (more musing) maybe the future of software engineering
| will be authoring bullet-proof acceptance tests.
| Psyladine wrote:
| Never mind compilers, the possibility of updating architectures
| & code bases by automation. A truly feasible attack on tech debt
| and legacy code.
| danuker wrote:
| Will also generate some fun bugs!
| mirekrusin wrote:
| Unless it provides or simply works within formal
| verification?
| f1shy wrote:
| Formal verification does not prove the absence of bugs. In the
| best case, it can only catch one particular type of bug.
|
| https://smartech.gatech.edu/handle/1853/62855
| adrianN wrote:
| Formal verification is very good at proving that compiler
| transformations preserve semantics. Programming language
| semantics are pretty well specified (at least for some
| programming languages...), so the chance of bugs in the
| specification is low.
| wokkel wrote:
| It can catch all kinds of bugs, but you have to ask the
| correct questions. So it comes down to defining what a bug
| is and the assumptions you use for testing. And therein
| lies the problem: what constitutes a bug? A function that
| adds two numbers but never terminates might be considered
| bug-free if you forget to include, as a bug, that not giving
| an answer before the end of the universe is faulty
| behaviour. We humans are terrible at writing these kinds
| of specifications, so formal verification as a method
| just pushes the correctness from code to specification.
| hdjjhhvvhga wrote:
| Relax - we'll soon have self-learning debuggers, too.
| iKlsR wrote:
| Whenever I see stuff about automated algorithms it reminds me
| of a movie, can't remember the name or plot exactly but they
| create a being or upload someone's consciousness and the "ai"
| is solving all the problems the world faces until it reaches
| the last one, humans. It then begins to synthesize some
| airborne virus or nukes that would scour us from the earth.
| We basically created tech that would eventually optimize us
| out of existence as a cancer to the earth.
|
| Edit: found it, Transcendence (2014). Also it seems my brain
| injected some random thoughts into that plot; it's not about
| that at all, but the point stands!
| haolez wrote:
| Transcendence is one of the few movies that present an
| almighty AI that's benevolent in nature.
| ASalazarMX wrote:
| Must have been an embarrassing incident for the engineer
| that commented out the "Don't kill humans" constraint for a
| quick debug session.
| jeffbee wrote:
| Superoptimizers are already extremely good at discovering
| latent flaws in high level source code, and in compilers.
| ColinWright wrote:
| Quoting:
|
| > _... AlphaTensor finds an algorithm for multiplying 4x4
| matrices using 47 multiplications in Z_2, thereby outperforming
| Strassen's two-level algorithm, which involves 7^2 = 49
| multiplications. By applying this algorithm recursively, one
| obtains a practical matrix multiplication algorithm in Z_2 with
| complexity O(N^2.778)._
|
| > _Moreover, AlphaTensor discovers efficient algorithms for
| multiplying matrices in standard arithmetic; for example,
| AlphaTensor finds a rank-76 decomposition of T_{4,5,5}, improving
| over the previous state-of-the-art complexity of 80
| multiplications._
| loufe wrote:
| This is completely besides the matter, but reading "provably" in
| the abstract is a frank reminder of how terrible English
| spelling/pronunciation is. I can't imagine I'm the only well-
| read native English speaker who read this as "prov-ably" on first
| take. I don't know about most languages, but you just don't get
| nonsense like this in French, at least.
| eigenschwarz wrote:
| https://www.merriam-webster.com/dictionary/provably
|
| Edit: Sorry, missed your point on first reading, which perhaps
| reinforces your point. I read papers like this for a living, so
| the language is unambiguous, but I see how it might not be for
| others, even if English is their first language.
| pwdisswordfish9 wrote:
| I agree. This is completely besides the matter.
| evmar wrote:
| English is terrible for sure, but it's pretty funny for this
| comment to compare it to the language that spells /kes k@ se/ as
| "qu'est-ce que c'est".
| elcomet wrote:
| I don't understand your point; provably means "can be
| proven". Is there another meaning I'm missing?
| ogogmad wrote:
| He's talking about the spelling not matching the
| pronunciation.
| 613style wrote:
| Think "probably" vs "provably" (it's about
| pronunciation/spelling)
___________________________________________________________________
(page generated 2022-10-05 23:00 UTC)