[HN Gopher] An AI system for solving crossword puzzles that outp...
       ___________________________________________________________________
        
       An AI system for solving crossword puzzles that outperforms the
       best humans
        
       Author : DantesKite
       Score  : 44 points
       Date   : 2022-05-20 18:47 UTC (4 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | zwieback wrote:
        | My dad was a big crossword puzzler. I asked him whether, if you
        | picked one of two possible answers to the first clue, it would
        | be possible to solve the entire puzzle one way or the other. He
        | sat down and created a series of puzzles with "themes", e.g.
        | "north" vs. "south" or "Schiller" vs. "Goethe", where all the
        | major words were drawn from one theme or the other.
       | 
        | Anyway, it would be interesting to see what the AI would do
        | with this: would there be two hotspots in the solution space,
        | one for each variant?
        
         | mcherm wrote:
          | There is also, famously, the November 5, 1996 NYT puzzle,
          | where a clue about the newly elected president could be
          | filled in as either CLINTON or BOBDOLE and all the crossing
          | words had two valid solutions.
         | 
         | If they trained the AI on the NYT archive then they would have
         | the results of testing it on this one.
        
       | thom wrote:
        | Note for Brits: this isn't for cryptic (dare I say 'real')
        | crosswords, but I assume it could be retooled for that.
        
         | tialaramex wrote:
         | American Crosswords are different in two key ways as I
         | understand it:
         | 
          | Firstly, all "serious" British crosswords are "cryptic", i.e.
          | once you figure out what the answer is, it's apparent why
          | that's the correct answer, but figuring out the answer from
          | the clue involves lateral thinking and skills learned from
          | years of staring at such clues.
         | 
         | e.g. Private Eye's crossword 726 (back in April), clue 23 down,
         | 
         | "He finally gets to penetrate agreeable person (relatively)
         | (5)"
         | 
         | The correct answer is "Niece". "Nice" can mean agreeable, the
         | final letter of "He" is E, and so by having the letter E
         | "penetrate" the word nice you produce "niece", a person who is
         | a relative.
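          | 
          | (If it helps, the insertion step is mechanical enough to write
          | down as a couple of lines of entirely illustrative Python,
          | nothing to do with the solver in the paper:)
          | 
          |   def penetrate(container, letter, pos):
          |       # cryptic "insertion": slot a letter inside the container
          |       return container[:pos] + letter + container[pos:]
          | 
          |   he_finally = "he"[-1]   # "He finally" -> final letter, "e"
          |   agreeable = "nice"      # "agreeable" = NICE
          |   print(penetrate(agreeable, he_finally, 2))  # "niece", a relative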
         | 
         | [ and yes, Private Eye is a satirical magazine, the crossword
         | clues are, likewise, intended to make you a little
         | uncomfortable while you laugh ]
         | 
          | Secondly, British crosswords are arranged with black "dead"
          | squares between letters to produce more of a lattice, in which
          | many letters take part in only one word; as a result, longer
          | answers are common.
         | 
         | e.g. same crossword, clue 26 across is
         | 
         | "Figure on getting your teeth into our statistical revelations
         | (6,9)"
         | 
         | The answer was "Number Crunching".
        
           | jen729w wrote:
           | Brit here. I woke up one morning - I was 15, so this was in
           | the 90s - with the word 'microdot' in my head. The first
           | thought, clear as anything, as if it was painted across the
           | inside of my eyes. Microdot!
           | 
           | Puzzled, I didn't move and set about figuring out why.
           | Eventually I realised that I had solved, in my sleep, a
           | crossword clue that I had not even gone to bed thinking
           | about. I'd read it at my grandma's house earlier the previous
           | day.
           | 
           | Tiny picture makes computer work on time (8)
           | 
           | The brain is amazing. I'm not even any good at the cryptic
           | crossword!
        
         | dane-pgp wrote:
         | I'm reminded of an article I read about an AI that competed in
         | a crossword competition and one particularly difficult clue it
         | faced was "Apollo 11 and 12 [180 degrees]". I don't know if it
         | would be allowed as part of a cryptic crossword, but the number
         | of letters in the (words of the) answer were 8, 4.
         | 
         | The answer to that clue is included here:
         | 
         | https://www.uh.edu/engines/epi2783.htm
        
           | nilstycho wrote:
           | That would usually be considered an invalid cryptic clue.
        
         | jamespwilliams wrote:
          | For cryptic crosswords I've found
          | https://www.crosswordgenius.com/ impressive (once you get
          | past the somewhat clunky UI).
        
       | mnd999 wrote:
        | Anyone else getting a bit bored with all these "AI does some
        | super-specialised task better than humans after enormous
        | amounts of training" stories? It's not very interesting
        | anymore.
       | 
        | Sure, it can do crosswords well, but the average human who
        | does crosswords well can also do a zillion other things, and
        | this type of AI is not getting us any closer to that.
        
         | joshcryer wrote:
          | But every specialized model like this gets us closer to
          | "doing a zillion other things." By that logic it is exactly
          | one step closer. The general AI agent will be composed of
          | many such models.
        
         | DantesKite wrote:
         | If you skim the paper, you'll realize what's most interesting
         | are the new techniques they developed to accomplish this,
         | advancing the field of machine learning in the process.
        
       | cinntaile wrote:
       | Now automatically send in the answers to the various weekly
       | magazine and newspaper competitions to get a passive prize
       | income.
        
       | ericwallace_ucb wrote:
        | Hi, I am the first author of this paper and I am happy to
        | answer any questions. You can find the technical paper here:
        | https://arxiv.org/abs/2205.09665
        
         | mikeryan wrote:
         | Hey this is cool, I do the NYT Crossword every day. A few
         | questions.
         | 
          | 1. You mention an 82% solve rate. The NYT puzzle gets
          | "harder" each day Monday through Saturday. Do you track the
          | days separately? If so, I'd be curious how much of the
          | unsolved 18% ends up on Fridays and Saturdays. (For anyone
          | who doesn't know, the Sunday puzzle is outside the M-Sat
          | range since it's a bigger puzzle.)
         | 
          | 2. Related to the above, Thursday puzzles usually have
          | "tricks" (skipped letters and whatnot) in them or require a
          | rebus (multiple letters in one square) - do you handle these
          | at all?
         | 
         | 3. Is this building an ongoing model and getting better at
         | solving? Or did you have to seed it with a set of solved
         | puzzles and clues?
         | 
          | Sorry, I didn't have time to read the whole paper.
        
           | nickatomlin wrote:
           | Hi! I'm another author on this paper. To answer your
           | questions:
           | 
           | 1. Monday puzzles are the easiest for our model, and
           | Thursdays are the most difficult. You can see a graph of day-
           | by-day performance here:
           | https://twitter.com/albertxu__/status/1527704535912787968
           | 
           | 2. Our current system doesn't have any handling for rebuses
           | or similar tricks, although Dr. Fill does. I think this is
           | part of why Thursday is the hardest day for us, even though
           | Saturday is usually considered the most difficult.
           | 
           | 3. We trained it with 6.4M clues. As new crosswords get
           | published, we could theoretically retrain our model with more
           | data, but we aren't currently planning to do that.
        
             | sp332 wrote:
             | I don't suppose you gave more weight to more recent
             | puzzles? Is there a time period or puzzle setter that was
             | harder to solve because they favored an unusual clue type?
        
         | avrionov wrote:
         | Do you think your approach can be applied to other problems?
        
         | Imnimo wrote:
         | For handling cross-reference clues, do you think it would be
         | feasible in the future to feed the QA model a representation of
         | the partially-filled puzzle (perhaps only in the refinement
         | step - hard to do for the first step before you have any
         | answers!), in order to give it a shot at answering clues that
         | require looking at other answers?
         | 
         | It feels like the challenges might be that most clues are not
         | cross-referential, and even for those that are, most
         | information in the puzzle is irrelevant - you only care about
         | one answer among many, so it could be difficult to learn to
         | find the information you need.
         | 
         | But maybe this sort of thing would also be helpful for theme
         | puzzles, where answers might be united by the theme even if
         | their clues are not directly cross-referential, and could give
         | enough signal to teach the model to look at the puzzle context?
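          | 
          | Concretely, I'm imagining something like the sketch below
          | (purely illustrative Python - the helper name and input
          | format are made up, not your actual API), where the
          | refinement-pass QA input carries the crossing slots' partial
          | fills:
          | 
          |   def clue_with_crossings(clue, crossing_fills, max_crossings=5):
          |       # Hypothetical input format: append the current partial
          |       # fills of the crossing slots ("_" = empty square) so the
          |       # QA model can condition on grid state when re-answering.
          |       crossings = " ; ".join(crossing_fills[:max_crossings])
          |       return f"{clue} [SEP] crossings: {crossings}"
          | 
          |   # e.g. re-querying a cross-reference clue on the second pass
          |   clue_with_crossings("See 17-Across", ["N_ECE", "B_BDOLE"])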
        
       | gardenfelder wrote:
       | https://github.com/albertkx/berkeley-crossword-solver
        
       ___________________________________________________________________
       (page generated 2022-05-20 23:00 UTC)