[HN Gopher] 'A-team' of math proves a critical link between addi... ___________________________________________________________________ 'A-team' of math proves a critical link between addition and sets Author : digital55 Score : 137 points Date : 2023-12-06 15:55 UTC (2 days ago) (HTM) web link (www.quantamagazine.org) (TXT) w3m dump (www.quantamagazine.org) | blowski wrote: | The maths goes over my head, but this paragraph was very | interesting: | | > Tao then kicked off an effort to formalize the proof in Lean, a | programming language that helps mathematicians verify theorems. | In just a few weeks, that effort succeeded. Early Tuesday morning | of December 5, Tao announced that Lean had proved the conjecture | without any "sorrys" -- the standard statement that appears when | the computer can't verify a certain step. This is the highest- | profile use of such verification tools since 2021, and marks an | inflection point in the ways mathematicians write proofs in terms | a computer can understand. If these tools become easy enough for | mathematicians to use, they might be able to substitute for the | often prolonged and onerous peer review process, said Gowers. | robertlagrant wrote: | > If these tools become easy enough for mathematicians to use | | Sick burn, Gowers. | gowld wrote: | It's honest constructive criticism. Lean is very hard to use. | It's even hard to install and run. | dumbo-octopus wrote: | Elan makes it actually quite easy. It's among the better | package managers out there, comparable to cargo. | | There's a single bash script to install both the runtime | and the package manager on a variety of platforms, and the | integrated build system and native module bundler is both | powerful and isomorphic, so all you need to learn is Lean. | Without any prior knowledge of Lean I was able to patch | existing build scripts to support platform specific C | extensions in less than half an hour. | kmill wrote: | I helped a very little with the formalization (I filled in some | trivial algebraic manipulations, and I was there for some Lean | technical support). | | It's exciting to see how quickly the work was done, but it's | worth keeping in mind that a top mathematician leading a | formalization effort is very exciting, so he could very easily | scramble a team of around 20 experienced Lean users. | | There aren't enough experienced Lean users to go around yet for | any old project, so Gowers's point about ease of use is an | important one. | | Something that was necessary for the success of this was years | of development that had already been done for both Lean and | mathlib. It's reassuring that mathlib is being developed in | such a way that new mathematics can be formalized using it. | Like usual though, there was plenty missing. I think this drove | a few thousand more lines of general probability theory | development. | cs702 wrote: | _> I helped a very little..._ | | Thank You for helping with the effort, even if it was only "a | very little." | | I love coming across comments like yours on HN. | ska wrote: | >Something that was necessary for the success of this was | years of development that had already been | | For projects like this it is often very thankless work in the | beginning, and can be a real grind. | | You need at least one person with a vision of how cool it | will be in the (poorly defined) future, and a lot of | determination. | zozbot234 wrote: | > Something that was necessary for the success of this was | years of development that had already been done for both Lean | and mathlib | | Yes but the bulk of the work on this project was background | as well. So the ease of use problem should be solving itself | over time as more and more prereqs get filled in. | | BTW, I find it interesting that mathlib is apparently also | getting refactored to accommodate constructive proofs better, | as part of the porting effort to Lean 4. This might encourage | more CS- and program-verification minded folks to join the | effort, and maybe some folks in math-foundations too (though | Lean suffers there by not being able to work with the | homotopy-types axioms). | kmill wrote: | Yes, sure, and that's how mathlib usually gets developed, | project by project and PRing missing background material. | I'm one of the mathlib maintainers, and I wanted to be sure | there's acknowledgment here for the years of Boubaki-style | work among hundreds of volunteers to build up the | mathematics. In a way, now that the paper's been formalized | it's time for the really hard work, which is figuring out | how to restructure it (and mathlib) so that it can easily | support future work. That's writing libraries for you. | | Where are you getting the idea mathlib is being refactored | for constructive proofs? I wouldn't say that's on the road | map. In fact, the core Lean developers are uninterested in | pursuing supporting this. | | Since you're here, are you able to say why constructive | proofs are interesting to program verification or (T)CS | people? I've never heard anyone in TCS even know what | constructivity is -- Lean supports their case where you | write a program/algorithm and then (with classical logic) | prove correctness and other properties. What would | constructive logic give them? | kmill wrote: | I'll mention that the "Lean way" for constructivity is | that you write `def`s for constructions. These can depend | on classical reasoning for well-typedness if you want | (like for being sure a number is in range when indexing | an array), but there is a computability checker. In | particular, the compiler sees if there is a way to lower | the `def` into some simply typed IR, and this IR can be | further compiled to C. If you trust the compiler, then | you can trust whether it's real construction. | | (Of course, you can also go and prove a definition | evaluates to some value if you don't want to trust the | evaluator.) | zozbot234 wrote: | > [W]rite a program/algorithm and then (with classical | logic) prove correctness. What would constructive logic | give them? | | Compared to the usual case of writing the "algorithm" | (i.e. the construction/decision part) and "proof" | separately, it basically offers better structuring and | composability. Imagine writing an "algorithm" that itself | relies for correctness on non-trivial preconditions, or | must maintain some invariants etc. Proving correctness of | a complete development that might involve wiring up | several of these together is quite possible if you can | use constructive proofs, but becomes quite tedious if you | have to work with the two facets separately. | | > Where are you getting the idea mathlib is being | refactored for constructive proofs? | | It's not being refactored _for_ constructive proofs, but | proofs that do use classical reasoning are being marked | as such in a more fine-grained fashion rather than | leaving it as the default for the entire development. | Which makes it at least possible to add more constructive | content. | kmill wrote: | Yeah, I can imagine that, and I've used sigma types to | carry it out (`Subtype`s in particular, or perhaps custom | structures). Why does the proof in the sigma type have to | be constructive? | | Certainly this is the road to "dependently typed hell" | and it's good to avoid mingling proofs with data if you | can, but avoiding the law of the excluded middle doesn't | make any of that any easier. | | This is also a bit of a distraction since what I really | mean regarding TCS is that in my experience is that they | do not see the proofs themselves as being objects-with- | properties. They're perfectly happy with non-constructive | reasoning to support constructions. I am looking to be | convinced that constructive logic is worth it, but this | mathematician hasn't seen anything compelling beyond work | on foundations themselves. | | Regarding mathlib, could you say what in particular | you're talking about, since it's not ringing a bell, and | I follow the changelogs. | zozbot234 wrote: | > Regarding mathlib, could you say what in particular | you're talking about, since it's not ringing a bell, and | I follow the changelogs. | | There's been several commits where general appeals to | classical reasoning in some development have been | replaced by more fine-grained assumptions of decidability | for some objects, or marking uses of classical reasoning | in some specific proofs. This has not happened throughout | mathlib of course, just in some specific places. But it | might still show one way of making the development a bit | friendlier to alternate foundations. | | > ...but avoiding the law of the excluded middle doesn't | make any of that any easier. | | Constructivist developments are not about "avoiding the | law of excluded middle". They're more about leveraging | the way we _already_ structure mathematical proofs (and | mathematical reasoning more generally) to make it easier | to understand where simple direct reasoning _has_ in fact | been used, and a theorem can thus be said to be useful | for such purposes. If all we did was proofs by | contradiction and excluded middle, there would be no | point to it - but direct proof is actually rather common. | | > Why does the proof in the sigma type have to be | constructive? | | It doesn't, _if_ the constructibility part has been | stated separately (e.g. stating that some property is | decidable, or that some object is constructible). That 's | in fact how one can "write the program/algorithm and the | logic separately" in a principled way. | kmill wrote: | Many times, Decidable assumptions are added simply | because they appear in the terms in a theorem statement, | and doing so makes applying such a theorem easier. | There's the technical annoyance that Decidable instances | are equal but not necessarily defeq, so if the wrong | instances are present (such as the classical ones) then | you need simp-like automation to rewrite instances before | unifying. Maybe this is what you're seeing? | | There have also been attempts to make polynomials | computable, which peppers everything with decidability, | but it's not a good data type for real large-scale | computations, so it's not clear what the future is there. | Maybe the defeqs are worth it, I don't know. | | Re constructibility, this is all at too high of a level | to really know what you're talking about, or why it's | beneficial to write proofs themselves in a certain way. | I'm not really even sure what you mean by "constructive". | To me, I see no problem with writing a recursive | definition that requires a non-constructive proof of | termination by well-founded recursion -- is that | constructive to you? | brap wrote: | Why did the verification step take so long? I imagine just | verifying proofs is very efficient, no? | | Or do they mean formalizing the proof in Lean is what took | weeks? | adastra22 wrote: | They're talking about formalizing the proof (aka writing Lean | "code"). | spadufed wrote: | If anybody's interested in learning more about Lean, he's been | posting his experiences with the project over at | @tao@mathstodon.xyz | Affric wrote: | On your opening sentence against this one from the article: | | > feels like really one of the most basic things that we didn't | understand | | Nothing can prepare us for the depth of mathematics. | owlbite wrote: | 10 years ago a crack professor unit was defunded by an academic | tribunal for misconduct they didn't commit. These men promptly | escaped from a maximum teaching schedule to the research | underground. Today, still wanted by the activist community, they | survive as proovers of fortune. If you have a problem that no one | else can solve, and if you can find them, maybe you can hire The | A-Team. | Agingcoder wrote: | That's funny thanks :-) | 082349872349872 wrote: | My favourite bit of every episode was the iconic oplisis when | Prof. Dr. P.H.D. Baracus first adds structure preserving maps | to every object in sight then uses the snake lemma to weld them | together with connecting homomorphisms. "I pity the fool who | doesn't chase diagrams" | CobrastanJorji wrote: | I got a little sick of the recurring joke where Prof. Dr. | Baracus would complain about flat, two-dimensional, infinite | surfaces, and then the team would trick him into using one | anyway. | jfengel wrote: | Goddamn that's a long way to go to make that joke. | | Worth the journey, but those are brain cells I haven't used | in decades. I really wish I could garbage-collect some of | the 1980s pop culture neurons and put them to better use. | shrimp_emoji wrote: | I feel like this about Call of Duty maps. | sumtechguy wrote: | A lone mathematician battles the forces of evil with the | help of a virtually indestructible and artificially | intelligent assistant CGPT. | gosub100 wrote: | I love it when a discrete manifold comes together! | kleiba wrote: | What do you think, how long until we can give conjectures to a | specially trained LLM to come up with a proof? | brap wrote: | My uneducated hunch is that we're just a few years away from | "proof search" being a solved problem. But that's only a part | of mathematical research. | prmph wrote: | I think you are much too optimistic. | | Proof verification, though still hard to automate, seems at | least tractable. I'm not a mathematician by any means, but | something tells me automated proof search, in the general | case, would require solving the halting problem, which means | it is impossible even in principle. | dumbo-octopus wrote: | In the general case, proof search is obviously impossible, | for the reason you state. But the case need not be general. | A proof search machine that takes in a theorem and outputs | `"proof: ..." | "antiProof: ..." | "idk"` would be quite | powerful indeed. | gpm wrote: | def decider(_): print("idk") | | Done ;) | | --- | | More seriously, the benchmark would be something along | the lines of "and it is better at finding proofs than | humans are", like how chess engines haven't solved chess, | but they are so much better than humans that they always | win in a competition between the two. | | There's no good reason to think that humans are capable | of finding proofs that are hard to find algorithmically. | brap wrote: | >There's no good reason to think that humans are capable | of finding proofs that are hard to find algorithmically | | That's really all I was trying to say, not sure why I'm | getting downvoted | dellamonica wrote: | It doesn't require anything like that. | | Math proofs are of NP complexity. If you had access to a | non deterministic Turing machine you could enumerate all | possible proofs of a given length and check them all in | poly time. | | That does not say anything about LLMs though. Personally, I | believe they could be quite helpful to mathematicians in a | way similar to copilot for software programming. | explaininjs wrote: | > you could enumerate all possible proofs of a given | length | | That does not help us with proof search as we do not know | the target length. | bawolff wrote: | A proof longer than the size of the universe is also | pretty useless and probably not something we need to | worry about. | | Like i guess you are saying we couldn't really use such a | machine to determine whether a certain conjecture just | has a very long proof/disproof or is actually | undecidable. Which sure, but umm i think that is the sort | of problem most mathematicians would love to have. | | The real reason non-deterministic turing machines can't | help us is that they dont actually exist. | dellamonica wrote: | Of course it would, you would enumerate lengths too. If | the lengths need to be larger than polynomially bounded | then we can be sure it would never be found by a human | anyway. | Jweb_Guru wrote: | > Math proofs are of NP complexity. | | This is only true for very specific kinds of proofs, and | doesn't apply to stuff that uses CiC like Lean 4. This is | because many proof steps proceed by conversion, which can | require running programs of _extremely_ high complexity | (much higher than exponential) in order to determine | whether two terms are equal. If this weren 't the case, | it would be much much easier to prove things like | termination of proof verification in CiC (which is | considered a very difficult problem that requires appeal | to large cardinal axioms!). There _are_ formalisms where | verifying each proof steps is much lower complexity, but | these can be proven to have exponentially (or greater) | longer proofs on some problems (whether these cases are | relevant in practice is often debated, but I do think the | amount of real mathematics that 's been formalized in | CiC-based provers suggests that the extra power can be | useful). | zozbot234 wrote: | Yes, executing a calculation as part of a proof after | separately proving that it is the "right" calculation to | solve the problem is quite a common way of proceeding. | Even common things like algebraic simplifications of all | kinds (including solving equations, etc.) can be seen as | instances of this, but this kind of thing has notably | come up in the solution of famous problems like the | 4-color problem, or the Kepler conjecture - both of which | have been computer-verified. | dellamonica wrote: | This is all very interesting but it seems that we're just | taking different views on what is the instance size. If | it is the length of the theorem statement in some | suitable encoding and the goal is to find a proof, of any | possible length, then yeah, this is way too hard. | | I'm taking the view that the (max) length of the proof | can be taken as a parameter for the complexity because | anything too long would not have any chance of being | found by a human. It may also not be trusted by | mathematicians anyway... do you know if the hardware is | bug free, the compiler is 100% correct and no cosmic | particle corrupted some part of your exponential length | proof? It's a tough sell. | waynecochran wrote: | Neither RH nor P vs NP will succumb. | vakkermans wrote: | Finding a proof is a search in a large space, akin to searching | from abstractions to concretions. LLMs don't do anything like | this, and so you're looking at the planning problem again. It's | not clear to me how framing this particular problem as a | language problem is helpful in any case. | thisismyswamp wrote: | Playing chess & go is also search in a large tree of moves | leading to particular game states | pxeger1 wrote: | But AlphaGo etc don't use any kind of language-based AI, so | LLMs (which this thread was about) are no good. | thisismyswamp wrote: | The next step seems to be applying past advances in | reinforcement learning with modern transformer based | models | mattsan wrote: | Which multiple teams are working on - OpenAI (Q*), and | Meta just released a reinforcement learning framework | greysphere wrote: | The final state in chess is a single* state which yes, then | branches out to N checkmate configurations and then N*M | one-move-from-checkmates, and so on. (*Technically it's | won/lost/draw.) | | The equivalent final state in theorem proving is unique to | each theorem so such a system would need to handle an | additional layer-of-generalization. | sfink wrote: | Perhaps an LLM could produce a plausible-looking proof that | is completely and utterly incorrect. | | As a party trick. | generalizations wrote: | Wouldn't that be the point of pairing it with Lean? You | wouldn't get false positives. | Sharlin wrote: | Doesn't mean you'd get true positives either. Garbage in, | garbage out. | bawolff wrote: | At the same level of success as a mathematician? Never. | | At the same level of success as your average high school | graduate? Probably already can. | kaba0 wrote: | I assume an average high school graduate can solve a sudoku. | LLMs are very interesting but anything "thinking"-related is | not their strong suit. | auntienomen wrote: | Open conjectures are usually solved by inventing new | mathematical frameworks that embed the context of the existing | question. Inventing new definitions is trivial, but there's no | rules that constrain the space of moves. And inventing | _interesting_ new definitions is extremely hard. I wouldn't bet | on LLMs doing this ever. | auntienomen wrote: | What would be nice is an LLM to aid in translation between | mathematics and a theorem proving language like Lean. That | might fuel a revolution in how mathematics is done. | kaba0 wrote: | For trivial stuff, it might be able to even now (but those can | probably be solved algorithmically as well). For more complex | stuff, they are not even scratching the surface, I would | believe -- LLMs can't really do long, complex | thoughts/inferences, which are essential for coming up with | proofs, or even just to solve a sudoku (which they can't do -- | no, writing a program (which was likely part of its training | set) and executing that doesn't count). | llwu wrote: | I contributed a few trivial proofs to this project, and I tried | enlisting GPT-4, Copilot, Moogle, and Morph Prover to help out | (I did not try LLMStep or ReProver). | | Out of these: | | - GPT-4 answered maybe one or two questions about syntax | | - Copilot autocompleted maybe 0.5% | | - Moogle was helpful much more often but still often pointed to | the wrong theorem in the right file; in most of those cases I | could have just done go-to-def to get to the right file and | scroll through it | | Note that these results are actually very good! And Terence Tao | seems to have a positive opinion as well. But we're very far | away from automatically proving conjectures IMO. | | I will say that Lean has great metaprogramming facilities to | accept any type of AI innovation that might emerge. Currently I | find the most helpful tactics to be `aesop?`, which is a | classical search tactic that tries a bunch of substitutions and | simplifications; and `exact?`, when you know there's a trivial | proof but not sure of the name. But the UX for AI can be as | simple as, for example, typing the `gptf` or `llmstep` tactic, | and that is fully hooked into the Lean state (goals, | hypotheses, etc). So there's a lot of opportunity for | incremental progress in improving existing workflows (which | consist of dispatching trivial manipulations with | search/simplification tactics) | srcreigh wrote: | It will never happen. Some open math problems are equivalent to | the halting problem; and we know that finite computers can't | solve the halting problem. | steego wrote: | Check out this paper: | | https://leandojo.org/ | | People have already trained models to assist suggestion | tactics. They then linked it up to ChatGPT to interactively | solve proofs. | | In this scenario, ChatGPT asks the model for tactic | suggestions, applies it to the proof and uses the feedback from | Lean to then proceed with the next step. | | FYI, The programmatic interface to Lean was written by an | OpenAI employee who was on the Lean team a few years ago. | | Also, check out Lean's roadmap. They aspire to position Lean to | becoming a target for LLMs because it has been designed for | verification from the ground up. | | As math and compsci nerds contribute to mathlib, all of those | proofs are also building up a huge corpus that will likely be | leveraged for both verification and optimization. | | If AI can make verification a lot easier, then we're likely | going to see verification change programming similarly to the | way it changed electronics. | logtempo wrote: | Yesterday I was wondering if it's possible to have an AI that can | "recreate" the mathematics as we know it, and even go further and | explore unexplored paths. | | Though prduced by a video explaining how AI helped to find | optimization in the computation of matrix multiplication. | jmj wrote: | I am working on that for my PhD! ___________________________________________________________________ (page generated 2023-12-08 23:00 UTC)