I consider AlphaGo - The Movie [1] to be a timeless classic that will never feel outdated. In my opinion, it surpasses even Hollywood productions, despite being based on true events and filmed live with real people. I'm ranking it as #2, though, because I still believe Steve Jobs' 2007 iPhone presentation [2] is the greatest live tech event ever captured on film. Hearing the crowd scream at tricks like slide to unlock, pinch to zoom, and flick scrolling still tugs at my heart, because we are now so used to gestures that were pure magic back then.
______________________________
1. https://www.youtube.com/watch?v=WXuK6gekU1Y
2. https://www.youtube.com/watch?v=VQKMoT-6XSg
Agreed, I've watched the AlphaGo movie, I think, three times by now. It elicits strong emotions in me, which I very rarely get from a movie or story.
I think it's because the subject matter and the people are very relatable to me. And it's real, filmed while it happened, instead of some made-up or retold story.
Well, Bridge remains unconquered, although it is unclear whether that is due to disinterest or incapability. As I have highlighted before, the day a computer false-cards will be the day. (False-carding: playing a certain card with the primary intention of deceiving the opponents and forcing an error.)
Does Bridge deal cards from a randomized deck? Because that's most likely the issue. I'm facing similar problems trying to build something that plays Magic: The Gathering-like games reasonably competently. The combinatorial explosion, and dealing with bluffing and hidden information, is a really tough nut to crack. My current guess is that you need something like Monte Carlo reinforcement learning to do it.
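For what it's worth, a common first workaround for randomized hidden hands is determinization (sometimes called Perfect Information Monte Carlo): sample many possible deals of the hidden cards consistent with what you have seen, evaluate each candidate move against each sample, and average. A minimal sketch on a made-up toy card game (the deck, rules, and all names here are invented for illustration):

```python
import random

# Toy hidden-information game (invented for illustration): each round,
# both players play one card and the higher card wins. We know our own
# hand and any cards already seen; the opponent's hand is hidden.
FULL_DECK = list(range(1, 11))

def sample_opponent_hand(my_hand, seen, hand_size, rng):
    """Determinize: deal the opponent a random hand consistent with
    every card we can account for."""
    unseen = [c for c in FULL_DECK if c not in my_hand and c not in seen]
    return rng.sample(unseen, hand_size)

def playout_value(my_card, opp_hand, rng):
    """Crude random rollout: the opponent answers with a random card."""
    return 1 if my_card > rng.choice(opp_hand) else 0

def best_move(my_hand, seen, opp_hand_size, n_samples=2000, seed=0):
    rng = random.Random(seed)
    scores = {}
    for card in my_hand:
        wins = sum(
            playout_value(card,
                          sample_opponent_hand(my_hand, seen, opp_hand_size, rng),
                          rng)
            for _ in range(n_samples))
        scores[card] = wins / n_samples
    return max(scores, key=scores.get), scores

move, scores = best_move(my_hand=[3, 7, 9], seen=[10], opp_hand_size=3)
print(move)  # 9: it beats every card the opponent could still hold
```

The known weakness of this style of sampling is that it cannot value bluffing or information hiding at all, which is exactly the hard part mentioned above; that is where methods like counterfactual regret minimization or RL over belief states come in.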
Forcing an error is an especially hard case, because in machine-vs-machine matches both sides would be aware that a given play could be an attempt to force an error, and would therefore not fall for it.
How about the slightly easier* game of Poker? OpenAI seems to be mildly interested:
https://xcancel.com/polynoamial?lang=en
https://arxiv.org/abs/2301.09159
(To be fair, re (card) games: I'm also only interested in seeing Cyborg-on-Cyborg action. Lee vs a-G almost qualified :)
I need to ask: when playing against AI players in poker games, are they fair (i.e., they work with the same set of cards and are not aware of your hand), or do they get to cheat?
(I played an MTG game years ago and it was not fair: the opponent's deck was not truly shuffled, so they always had the cards that provided a certain experience.)
They are indeed fair. The strongest poker bots are not AI in the way it is commonly defined. From my understanding, they calculate the Nash equilibrium for a simplified (abstracted) game and extrapolate that to the full game.
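For a feel of the equilibrium-finding side, here is a toy self-play sketch (not code from any actual poker bot) of regret matching, a core ingredient of the counterfactual-regret-minimization family those bots build on. On rock-paper-scissors, the average strategy converges to the uniform Nash equilibrium:

```python
import random

# Regret matching in self-play on rock-paper-scissors: repeatedly play
# the current mixed strategy, accumulate regret for each action you
# could have played instead, and report the *average* strategy.
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # row action vs column action

def strategy_from_regret(regret):
    """Play each action in proportion to its positive regret."""
    pos = [max(r, 0.0) for r in regret]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1 / ACTIONS] * ACTIONS

def train(iters=20000, seed=0):
    rng = random.Random(seed)
    regret = [0.0] * ACTIONS
    strat_sum = [0.0] * ACTIONS
    for _ in range(iters):
        strat = strategy_from_regret(regret)
        for i in range(ACTIONS):
            strat_sum[i] += strat[i]
        me = rng.choices(range(ACTIONS), weights=strat)[0]
        opp = rng.choices(range(ACTIONS), weights=strat)[0]  # self-play
        for alt in range(ACTIONS):  # regret: what `alt` would have earned
            regret[alt] += PAYOFF[alt][opp] - PAYOFF[me][opp]
    total = sum(strat_sum)
    return [s / total for s in strat_sum]

avg = train()
print(avg)  # close to [1/3, 1/3, 1/3], the Nash equilibrium of RPS
```

The real bots run CFR variants over an abstracted poker game tree and then map the result back to full poker, which is the "simplified game, extrapolated" step described above.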
Would be nice if they could.. you know.. show the moves they're talking about.
The games can be viewed on gokifu: <http://gokifu.com/player/Alphago>
They're ordered by date from newest to oldest, so it's the 3rd and 4th games v Lee Sedol from the top down.
There are a lot of parallels between rule-based games like Go and rule-based formal systems like ZFC. It’s interesting that the same techniques used for AlphaGo have not worked nearly as well for finding proofs of famous open problems that we suspect are both 1) decidable within ZFC and 2) have a “reasonable” minimal proof length.
What aspect of efficiently exploring the combinatorial explosion in possibilities of iterated rule-based systems is the human brain still currently doing much better than machines?
https://www.moderndescartes.com/essays/gnugo_to_agz/
I happen to have recently written up a longer history of Go AI. If you're wondering about what is special about Go in particular or what generalizes to other problems, give it a read.
Coincidentally, I just watched the hour-long documentary that DeepMind made about the match [1]. It talks a lot about the two moves - though not really in detail.
To a non-go player like myself, both moves 37 and 78 seemed completely arbitrary. I mean, much of the video talks about how it's impossible to calculate all the future moves like in chess, yet move 37 of a possible ~300 move game is called out as genius, and move 78 is a God Hand.
For the layman like myself, it seemed a bit inconsistent.
The thing that made me smile was how history repeated itself. Sedol predicted a 5-0 win against the program. Kasparov was pretty cocky as well in the 1990s. You'd think someone would have warned him! "Hey Sedol. Cool your jets, these guys wouldn't be spending so much money just to embarrass themselves."
DeepMind was definitely way more polite than IBM, so that was good to see. The Deep Blue team were sorta jerks to Garry.
1. https://www.youtube.com/watch?v=WXuK6gekU1Y
> I mean, much of the video talks about how it's impossible to calculate all the future moves like in chess, yet move 37 of a possible ~300 move game is called out as genius, and move 78 is a God Hand.
Every move is a choice among ~300 possibilities, and you need to calculate far ahead to know whether it's a good move or not, so the number of lines you have to explore is far greater than it seems.
If you think of the Go board as a battlefield and the stones as troops, you may get a sense of it. You're trying to secure large areas of the board. Do you spread out your forces at the risk of spreading them too thin, or do you build solid walls with them at the risk of definitively securing only a small area?
In between these two extremes is the dance where the elegance happens. Large, seemingly secure areas get split into two. Multiple, separate battles grow and merge into larger ones. A single, well placed stone earlier in the game could prove pivotal as a battle creeps towards it.
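To put rough numbers on the explosion (using commonly cited averages, not exact figures): Go offers about 250 legal moves per turn over roughly 150 moves, versus about 35 per turn over roughly 80 for chess, so naive lookahead faces vastly more lines:

```python
# Back-of-the-envelope game-tree sizes, using commonly cited averages
# for branching factor (legal moves per turn) and game length.
go_branching, go_length = 250, 150
chess_branching, chess_length = 35, 80

go_tree = go_branching ** go_length
chess_tree = chess_branching ** chess_length

# number of digits minus 1 == floor(log10(x))
print(f"chess: ~10^{len(str(chess_tree)) - 1} lines")
print(f"go:    ~10^{len(str(go_tree)) - 1} lines")
# chess: ~10^123 lines
# go:    ~10^359 lines
```

Which is why neither Deep Blue-style pruning nor AlphaGo-style learned move priors ever enumerate these trees; they only make tiny, well-chosen slices of them tractable.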
I don't know why you'd expect to be able to read the board as a non-player. If I watch a surgeon at work a lot of the individual motions are inscrutable to me. It's like looking at sheet music if you don't play an instrument. You just don't have the mental schema to see what is interesting about even the most interesting parts.
I was watching this game with my go club, and we all instantly saw the significance of 37; it was audible in the room. 78 felt tangibly different: some of us immediately read it as a clear misplay, while others took longer to come to any conclusion, just puzzled. Our most experienced player, at the time 5 dan, gasped when he got it. But it still took him time to even intuit what it was doing. Now that it is well understood, moves of that type are common even in intermediate level play. Changed the game forever.
> Now that it is well understood, moves of that type are common even in intermediate level play. Changed the game forever.
That's an important takeaway from the AlphaGo saga:
It played moves that (at the time, for human players) seemed weird. And while playing those, outperformed humans.
But as understanding of the how and why of such moves grew, it showed humans new ways of doing things. And in learning those, humans became better players themselves.
AlphaGo broke new ground, humans followed. And like you said: changed the game forever.
Also, the subtlety of what makes a win:
Humans, before AlphaGo: try to grab as much territory as possible to beat your opponent.
AlphaGo: just try to grab more territory than the opponent (so, not necessarily much more). Ending up with only a 1-point advantage is still a win.
Different viewing angle, different strategy, different outcome.
You misunderstood my criticism. I have zero idea about Go, and I know it.
What I would have liked is for the video to take a minute to explain how a single move so early in a game was immediately obvious to players as amazing, given so much focus was put on the fact that there are more move options than atoms in the universe.
Let me rephrase: Mathematically, what was it about move 37 that reduced the quintillions+ of possible outcomes down to a perceived guaranteed win?
My assumption is that there are far fewer combinations of practical moves, which constrains the calculations considerably. I would have liked to have known more about that.
Oh, you're right I did misunderstand you, sorry.
I don't think there's really a framework for that sort of analysis yet. Go players talk about influence and structure but they aren't thinking of a move shrinking the problem space in that way, even though of course it does.
And mathematical analysis has so far mostly (AFAIK) been about the broader game. Trying to use computation to understand the value of individual moves in this way is pretty much exactly the dead end that caused DeepMind to wind up using the approach they did. An approach that certainly wins games, but so far it has been up to human Go players to explain why, and they use traditional Go-player tools to do so.
If you find anything, let me know, because it's super interesting. But I think what you're looking for is an as-yet-unwritten math or CS thesis by a serious Go-playing PhD candidate.
https://archive.is/EtWDJ
Interesting that Lee Sedol losing at Go was the big opening act in the modern AI wave, but it ended up coming from a completely different technology that has effectively faded into the background.
They used deep neural networks, reinforcement learning, and Monte Carlo tree search. All except the MCTS are critical components of modern LLMs. MCTS is a form of planning which you can argue has parallels to "reasoning" models, although that's pretty tenuous I admit.
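For anyone curious what the MCTS piece looks like stripped of the neural networks, here is a minimal UCT (Upper Confidence bounds applied to Trees) sketch on the toy game Nim: players alternately take 1-3 stones, and whoever takes the last stone wins. AlphaGo's twist was to replace the uniform move selection and random rollouts below with learned policy and value networks; everything here is a simplified illustration, not DeepMind's code:

```python
import math
import random

TAKE = (1, 2, 3)  # legal moves in Nim: take 1, 2, or 3 stones

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children = []
        self.visits, self.wins = 0, 0.0  # wins for the player who moved here
        self.untried = [a for a in TAKE if a <= stones]

def uct_child(node, c=1.4):
    """Pick the child balancing exploitation (win rate) and exploration."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts_best_move(stones, iters=3000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iters):
        # 1. selection: walk down fully expanded nodes via UCT
        node = root
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. expansion: add one untried move
        if node.untried:
            a = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - a, parent=node, move=a)
            node.children.append(child)
            node = child
        # 3. rollout: random play to the end of the game
        s, player, winner = node.stones, 0, 1
        while s > 0:
            s -= rng.choice([a for a in TAKE if a <= s])
            if s == 0:
                winner = player  # this player took the last stone
            player ^= 1
        # 4. backpropagation: flip the result's perspective at each level
        r = 1.0 if winner == 0 else 0.0  # from the leaf mover's view
        while node is not None:
            node.visits += 1
            node.wins += 1.0 - r         # a win for whoever moved into `node`
            node, r = node.parent, 1.0 - r
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts_best_move(5), mcts_best_move(6), mcts_best_move(7))
# optimal Nim play leaves a multiple of 4 for the opponent
```

Even with purely random rollouts this solves small Nim positions; the reason it stalls on Go is that random rollouts are far too weak an evaluator there, which is the gap the value network filled.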
> completely different technology that has effectively faded into the background.
If by that you mean reinforcement learning, that's not the case; e.g. see https://arxiv.org/abs/2501.12948
how so?
Modern post-training uses RL and immense amounts of synthetic data to iteratively bootstrap better performance. If you squint, this is extremely similar to the AlphaZero approach of iteratively training with RL on data generated through self-play.
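That bootstrap shape can be caricatured in a few lines. Here is a tabular toy stand-in (self-play on Nim: take 1-3 stones, last stone wins; no neural network, and all names invented for illustration): generate games with the current policy, label every (state, action) with the final result, train on that synthetic data, repeat.

```python
import random

TAKE = (1, 2, 3)  # legal moves in Nim: take 1, 2, or 3 stones

def legal(n):
    return [a for a in TAKE if a <= n]

def self_play(Q, eps, rng, start=10):
    """Play one game against yourself with the current (epsilon-greedy)
    policy; label every move with the eventual result for its mover."""
    n, history = start, []
    while n > 0:
        acts = legal(n)
        a = (rng.choice(acts) if rng.random() < eps
             else max(acts, key=lambda a: Q.get((n, a), 0.0)))
        history.append((n, a))
        n -= a
    data, r = [], 1.0  # the last mover took the last stone and won
    for state, action in reversed(history):
        data.append((state, action, r))
        r = -r         # the result flips perspective each ply
    return data

def train(iters=200, games_per_iter=50, lr=0.1, eps=0.3, seed=0):
    rng = random.Random(seed)
    Q = {}
    for _ in range(iters):
        batch = []  # "synthetic data" generated by the current policy
        for _ in range(games_per_iter):
            batch.extend(self_play(Q, eps, rng))
        for s, a, r in batch:  # "post-training" on that data
            Q[(s, a)] = Q.get((s, a), 0.0) + lr * (r - Q.get((s, a), 0.0))
    return Q

Q = train()

def best(n):
    return max(legal(n), key=lambda a: Q.get((n, a), 0.0))

print(best(5), best(6), best(7))  # optimal play leaves a multiple of 4
```

The real systems replace the table with a network and the raw win/loss label with value targets and policy distillation, but the loop is the same: the policy generates the data, and the data improves the policy.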