hngopher.com

       [HN Gopher] Mastering Stratego
       ___________________________________________________________________
        
       Mastering Stratego
        
       Author : beefman
       Score  : 97 points
       Date   : 2022-12-01 20:20 UTC (2 hours ago)
        
 (HTM) web link (www.deepmind.com)
 (TXT) w3m dump (www.deepmind.com)
        
       | [deleted]
        
       | beefman wrote:
       | The paper is sadly paywalled. I believe this is the preprint:
       | 
       | https://arxiv.org/abs/2206.15378
        
       | waprin wrote:
       | Great article. I played Stratego a lot as a kid and it always
       | felt simpler than chess, Go, or poker, so it's surprising that
       | it has a much bigger game tree, until you stop and think.
       | 
       | I'm curious about the comparisons to poker. I know the hot
       | algorithm in poker solvers is counterfactual regret
       | minimization (CFR). The article indicates that the feedback
       | cycle is too long for those algorithms to work, but I'd be
       | curious to learn more about the relationship between CFR and
       | what's tried here, if any.
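
For readers unfamiliar with CFR: its per-decision building block is regret matching, which plays each action in proportion to its accumulated positive regret. Below is a minimal sketch of that single step. The function name is made up for illustration, and this is not the method used in the Stratego paper.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Turn accumulated regrets into a mixed strategy.

    Actions with positive regret are played in proportion to that regret.
    If no action has positive regret, fall back to a uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


if __name__ == "__main__":
    # Three actions; the second one has been regretted negatively so far.
    print(regret_matching([1.0, -2.0, 3.0]))
```

Full CFR repeats this at every information set and averages the strategies over many iterations, which is where a long game like Stratego becomes expensive.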
        
         | _HMCB_ wrote:
         | I played it too. My goodness what a blast from the past. To be
         | honest, none of my friends liked to play. I mostly played by
         | myself it seems. LOL.
        
       | sdwr wrote:
       | I remember seeing a version of the paper earlier in the year (it
       | talked a lot about getting the bot to be aggressive to avoid
       | stalemates).
       | 
       | Feels like the secret sauce has to be probability distributions
       | guessing what all the pieces are.
       | 
       | Bluffing in Stratego _seems_ like it requires long-term planning
       | (if you move a 2 like a 10, you have to keep treating it like
       | that for the bluff to work).
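
The "probability distributions over pieces" idea can be sketched as a simple belief update. This is an illustration of the comment only, not a description of how the DeepMind agent works. It assumes the classic 40-piece rank counts and treats each unknown piece independently, and all names here are hypothetical.

```python
from typing import Dict, Tuple

# Assumed classic Stratego rank counts (40 pieces per side).
RANK_COUNTS: Dict[str, int] = {
    "flag": 1, "bomb": 6, "spy": 1, "scout": 8, "miner": 5,
    "sergeant": 4, "lieutenant": 4, "captain": 4, "major": 3,
    "colonel": 2, "general": 1, "marshal": 1,
}


def initial_belief(counts: Dict[str, int]) -> Dict[str, float]:
    """Prior over an unseen piece's rank, proportional to the rank counts."""
    total = sum(counts.values())
    return {rank: n / total for rank, n in counts.items()}


def observe_move(
    belief: Dict[str, float],
    immobile: Tuple[str, ...] = ("flag", "bomb"),
) -> Dict[str, float]:
    """Condition on the piece having moved: immobile ranks drop to zero."""
    kept = {r: (0.0 if r in immobile else p) for r, p in belief.items()}
    norm = sum(kept.values())
    if norm == 0:
        raise ValueError("observation is inconsistent with the belief")
    return {r: p / norm for r, p in kept.items()}


if __name__ == "__main__":
    prior = initial_belief(RANK_COUNTS)
    posterior = observe_move(prior)
    print(f"P(bomb) before move: {prior['bomb']:.3f}")
    print(f"P(scout) after move: {posterior['scout']:.3f}")
```

A bluff works against exactly this kind of inference: moving a 2 the way a 10 would move keeps the opponent's belief skewed toward the wrong rank.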
        
       | dr_faustus wrote:
       | Call me a cynic, but the fact that after almost 10 years of AI
       | hype we are still working our way down the list of popular board
       | games is a bit of a downer for me. I mean, having AIs to play
       | against at Stratego, Risk, Go, Diplomacy and what have you sure
       | is nice. But there are literally billions of dollars spent on
       | these projects, and I've really come to the point where I just don't
       | believe anymore that the current AI approaches will ever
       | generalize to the real world, even in relatively limited scopes,
       | without the need for significant human intervention and/or
       | monitoring. What am I missing?
        
         | zaptrem wrote:
         | Have you tried this out yet? https://chat.openai.com/
         | 
         | It's been providing real value to me over the past day for
         | practicing Spanish, explaining Machine Learning concepts, and
         | doing fancy write-ups in LaTeX. And this one can't even use
         | Google yet! (other research teams have already created models
         | capable of doing so, it's only a matter of time until these
         | innovations are brought together in one place)
        
         | VHRanger wrote:
         | AI remains better than humans at anything that has well defined
         | rewards and small time gap between action and feedback
         | mechanism (either naturally, like poker, or by value function
         | engineering, like Go or Chess)
         | 
         | The problem here is that it's missing the "glue" to more real
         | world applications. This is where more humdrum software
         | engineering comes in.
         | 
         | Diplomacy in this is much more interesting than Stratego or
         | beating the next video game - it mixes cooperative game theory
         | with NLP and reinforcement learning.
        
         | runarberg wrote:
         | There was a time when I thought that maybe there was something
         | more to AI than a fancy statistical model for fitting
         | non-linear data. But I'm solidly of the belief now that AI is
         | precisely a very powerful statistical tool. I honestly think
         | there was never any real strategy for getting AI to anything
         | more than specialized learning for deeper inference using a lot
         | of computational power.
         | 
         | Don't get me wrong, using AI for that purpose is pretty amazing
         | (but can also lead to some sketchy results if you don't know
         | what you are doing[1]) but pretending it will lead to some
         | "general AI" is nothing but hype IMO. And teaching AI to play
         | these board games better than a grandmaster only serves to
         | increase that hype.
         | 
         | 1: https://www.vox.com/recode/2019/8/15/20806384/social-
         | media-h...
        
           | pbronez wrote:
           | I'm not an AGI fanboy. I agree that the current line of
           | inquiry (i.e., deep learning) won't get us there. I think
           | neurosymbolic reasoning is needed. That work is still
           | nascent, and worse, we don't have great ways to connect our
           | current paradigm to it.
        
       | clolege wrote:
       | It's interesting to watch the videos they link of DeepMind
       | playing against the top-level Stratego masters [0]. I usually
       | find Stratego to be a bit of a dull game (less elegant and more
       | drawn out than Go and chess), but I'm a sucker for watching top-
       | level AIs play.
       | 
       | Its skills for bluffing are both fascinating and a bit scary.
       | 
       | [0] https://www.youtube.com/watch?v=HaUdWoSMjSY
       | https://www.youtube.com/watch?v=L-9ZXmyNKgs
       | https://www.youtube.com/watch?v=EOalLpAfDSs
       | https://www.youtube.com/watch?v=MhNoYl_g8mo
        
         | ep103 wrote:
         | Are these games against Stratego masters? I'm watching the
         | first one, but it doesn't say who they're playing against.
        
           | sdwr wrote:
           | Don't think the player pool is very deep; doubt there are many
           | masters around.
        
           | clolege wrote:
           | yep, top anonymized players
        
         | VHRanger wrote:
         | FWIW one of the big things poker AI taught humans is massive
         | overbets (e.g., going all in for $200 into a $15 pot).
         | 
         | This is scary to do well in practice, because the
         | mathematically optimal bluff frequency approaches 50% as you
         | increase the overbet size.
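
The 50% limit falls out of the standard toy-game indifference argument. Assuming a single final bet of size `bet` into a pot of size `pot`, with a polarized bettor and a caller holding a pure bluff-catcher, the caller is indifferent when bluffs make up bet / (pot + 2 * bet) of the betting range. The function name below is made up for illustration.

```python
def optimal_bluff_fraction(pot: float, bet: float) -> float:
    """Share of the betting range that should be bluffs in the toy model.

    The caller risks `bet` to win `pot + bet`, so calling breaks even when
    the bettor is bluffing with probability bet / (pot + 2 * bet).
    """
    if pot <= 0 or bet <= 0:
        raise ValueError("pot and bet must be positive")
    return bet / (pot + 2 * bet)


if __name__ == "__main__":
    # Pot-sized bet, then the $200-into-$15 overbet from the comment above.
    for pot, bet in [(15, 15), (15, 200), (15, 2000)]:
        share = optimal_bluff_fraction(pot, bet)
        print(f"bet {bet} into {pot}: bluff share {share:.3f}")
```

As `bet` grows with `pot` fixed, the ratio climbs toward 1/2 but never reaches it, which matches the comment: the bigger the overbet, the closer to half of those bets have to be bluffs.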
        
           | clolege wrote:
           | Wow, that's crazy.
           | 
           | It seems like it would be easier for AI to do, since it
           | doesn't have any tells (it's easier to have a poker face when
           | you don't have a face at all).
           | 
           | I remember playing poker as a kid, and experimenting with
           | pretending like my cards were good/bad with body language. I
           | don't think that any professional players use that approach
           | (they just have sunglasses and a straight face), but I wonder
           | if AI could beat humans even more consistently if it
           | developed a way to convey tells and fake tells?
        
         | sdwr wrote:
         | Watched some of the first game. I'd bet Stratego favors
         | defence, which is an advantage for an AI that has no/minimal
         | concept of the value of time.
        
           | clolege wrote:
           | Yeah this is one of the reasons why I find it more dull than
           | chess.
           | 
           | There is an incentive to just _not_ move your pieces, so that
           | the other player thinks they're bombs. As a result, players
           | only activate 2-3 pieces at a time.
           | 
           | In chess, on the other hand, you are constantly pushing your
           | pawns toward promotion on the other side, or otherwise trying
           | to activate/coordinate all of your pieces for an attack.
           | 
           | It makes me think that if DeepMind was trained to _not lose_
           | instead of _win_, then the top strategy might be shuffling
           | pieces and letting the enemy come to attack. No human would
           | ever have the patience to play that way though.
        
       | jstummbillig wrote:
       | Can anyone shed light on how this is more challenging than the
       | StarCraft or Dota agents, which also had to work with
       | imperfect information?
        
         | Tenoke wrote:
         | Starcraft and Dota benefit a lot from having good micro.
         | Stratego seems to be only macro. Micro is easyish for AI and
         | requires less long-term thinking to get benefits from.
        
           | adgjlsfhk1 wrote:
           | StarCraft especially only resembles a strategy game in GM
           | (and maybe high masters). Below that, the strategy is mostly
           | "macro better so you have more units".
        
         | machina_ex_deus wrote:
         | Dota is a pretty local game. 70% tactics, 20% strategy. Maybe
         | 10% information. Yes, you have the warding game, but an AI pays
         | no cost for looking at heroes' inventories (humans need to spend
         | attention and move their map), so it already has a huge advantage
         | over humans in the imperfect-information part. Usually fighting
         | into the imperfect information is the bad choice.
         | 
         | Stratego is 40% information, 40% strategy, maybe 10% tactics.
         | If you know where the flag is, it's trivial to win in almost all
         | situations. Fighting into imperfect information is literally
         | the whole game.
        
           | ghostbrainalpha wrote:
           | Is Tactics the same thing as execution?
           | 
           | Like is it just the speed of your clicking? Or is it more
           | than that, like the most basic kinds of strategic decisions?
        
             | keithnz wrote:
             | In Dota, tactics has to do with the execution of
             | abilities, often in coordination with other agents
             | executing their abilities to get combo effects, while
             | adapting to the situation as it unfolds.
        
               | dtdynasty wrote:
               | As an avid Dota player I wouldn't agree with your
               | characterization that 70% of Dota 2 is your definition of
               | tactics. What I've noticed differentiates player MMR the
               | most is the strategy applied to each context. It's rarely
               | the execution that's the problem as you can gain such
               | overwhelming advantages through strategy.
        
               | machina_ex_deus wrote:
               | There's barely any long-term strategy in Dota; the only
               | meaningful strategic decisions are items and heroes. Even
               | ultimate usage has only about a 2-minute window of
               | importance. Wards too. And maybe the decision to push high
               | ground, because of how many games are lost to it, but
               | it's usually the tactical errors that make most of the
               | difference there.
               | 
               | What's your MMR, out of curiosity?
        
             | machina_ex_deus wrote:
             | It's from things like properly last hitting creeps, good
             | reaction timing, good reaction decisions, coordinating
             | real-time actions with teammates at millisecond resolution.
             | 
             | It's obviously not about clicking fast, but it is about
             | timing; sometimes 100 milliseconds of reaction time makes a
             | huge difference in outcome. It is usually about making
             | decisions on very small time scales. Do you retreat or
             | continue? Use an ability or hold it? Can you overextend?
             | 
             | The only meaningful strategic decisions in Dota (where you
             | have a long time frame to decide and which affect the game
             | for a long duration) are the draft (which the AI doesn't
             | really master; they reduced the hero pool to simplify) and
             | item purchases, and there are only a handful of them (~6) in
             | an entire game. Other decisions don't really have a long
             | "memory" time, a minute or two at the most. After two
             | minutes every other decision is just reduced to the
             | relative advantage between the teams.
             | 
             | There used to be one hero in Dota that made it a strategy
             | game instead (Techies). But it was like playing a different
             | game, everyone hated it, and it was effectively removed.
             | Techies was like playing stratego against chess players,
             | they obviously get pissed off by not playing what they
             | wanted.
        
               | dtdynasty wrote:
               | There are larger strategic decisions that are significant
               | in Dota: which area of the map to play, which objectives
               | are important and when, what type of fights we will win
               | (fast and bursty) and when we will take them. Often
               | these are thought of at the beginning of the game and
               | affect gameplay throughout.
        
           | rosmax_1337 wrote:
           | Dota at the mid-casual and high-casual brackets (which is
           | where you find most players) is also a social game.
           | Establishing efficient leadership, communication and
           | cooperation in a game gives you a huge advantage. At the
           | low-casual and the pro levels, funnily enough, it becomes
           | more a game of skill and strategy.
           | 
           | (The old joke is that Dota is a 1 v 9 game, not a 5 v 5)
        
       | beefman wrote:
       | There's an extra space in the link to their code (at the end of
       | the article). The correct URL is:
       | 
       | https://github.com/deepmind/open_spiel/tree/master/open_spie...
        
         | ArtWomb wrote:
         | Wow! Thanks to DeepMind for OpenSpiel! Am looking forward to AI
         | experimenting with Stratego, Battleships & Hanabi ;)
        
       ___________________________________________________________________
       (page generated 2022-12-01 23:00 UTC)