[HN Gopher] Minerva: Solving Quantitative Reasoning Problems wit...
       ___________________________________________________________________
        
       Minerva: Solving Quantitative Reasoning Problems with Language
       Models
        
       Author : guyga
       Score  : 29 points
       Date   : 2022-06-30 17:22 UTC (5 hours ago)
        
 (HTM) web link (ai.googleblog.com)
 (TXT) w3m dump (ai.googleblog.com)
        
       | alphabetting wrote:
       | Pretty insane. I thought AI had a ways to go before it could
       | solve problems like these.
       | https://minerva-demo.github.io/#category=Algebra&index=1
        
       | sharemywin wrote:
       | Limitations
       | 
       | Our approach to quantitative reasoning is not grounded in formal
       | mathematics. Minerva parses questions and generates answers using
       | a mix of natural language and LaTeX mathematical expressions,
       | with no explicit underlying mathematical structure. This approach
       | has an important limitation, in that the model's answers cannot
       | be automatically verified. Even when the final answer is known
       | and can be verified, the model can arrive at a correct final
       | answer using incorrect reasoning steps, which cannot be
       | automatically detected. This limitation is not present in formal
       | methods for theorem proving (e.g., see Coq, Isabelle, HOL, Lean,
       | Metamath, and Mizar). On the other hand, an advantage of the
       | informal approach is that it can be applied to a highly diverse
       | set of problems which may not lend themselves to formalization.
        
         | gwern wrote:
         | ('Sure, it can now read and write natural language with LaTeX
         | math and do Python programming, solving many challenging
         | problems from elite college STEM curricula autonomously,
         | offline from large natural datasets, without any hand-
         | engineering to teach it the specific algorithms, but you can't
         | easily _verify_ it, now can you? So overhyped. ')
        
         | sharemywin wrote:
         | Don't get me wrong, it's pretty amazing what it can do, but I
         | get the impression there's no confidence in the answers, so
         | I'm not really sure what to do with it. As with a lot of these
         | systems, how cherry-picked were the answers?
        
           | alphabetting wrote:
           | Some interesting stuff from one of the authors. It could only
           | solve a third of MIT undergrad problems but did well on other
           | tests.
           | 
           | https://twitter.com/alewkowycz/status/1542559200051593219
        
           | nharada wrote:
           | If you can create a system that presents a limited number of
           | novel solutions, a human could double-check each one.
        
       ___________________________________________________________________
       (page generated 2022-06-30 23:01 UTC)