Life In 19x19 :: AlphaGo selfplay

Author:	aeb [ Sat May 27, 2017 5:37 am ]
Post subject:	AlphaGo selfplay
DeepMind announced that they would publish 50 more selfplay games by AlphaGo, and published 10 today. They can be found on http://homepages.cwi.nl/~aeb/go/games/games/AlphaGo/ .

Author:	Mef [ Sat May 27, 2017 8:59 am ]
Post subject:	Re: AlphaGo selfplay
White wins 8 of 10, I wonder if that suggests that komi should be a little lower. Though I suppose that it could also be an idiosyncrasy in how AlphaGo plays.

Author:	alphaville [ Sat May 27, 2017 3:51 pm ]
Post subject:	Re: AlphaGo selfplay
Mef wrote: White wins 8 of 10, I wonder if that suggests that komi should be a little lower. Though I suppose that it could also be an idiosyncrasy in how AlphaGo plays.

If changing it by one point, it may just flip similarly in black's favour. Deepmind said that 7.5 with Chinese rule is as good as it gets.

Author:	Kirby [ Sat May 27, 2017 8:54 pm ]
Post subject:	Re: AlphaGo selfplay
alphaville wrote: Deepmind said that 7.5 with Chinese rule is as good as it gets.

It's not clear to me in my head that Deepmind would really know the appropriate komi (for certain). Changing the komi could have significant impact on game winning probabilities from a given board position, and could result in an altered strategy. So all of the training that's been happening through self-play might make the most sense having 7.5 komi.

It could very well be that 7.5 komi is correct, but I would think that this should be investigated more scientifically if we really want an answer to this question.

Author:	Mef [ Sat May 27, 2017 9:18 pm ]
Post subject:	Re: AlphaGo selfplay
Mef wrote: White wins 8 of 10, I wonder if that suggests that komi should be a little lower. Though I suppose that it could also be an idiosyncrasy in how AlphaGo plays.

alphaville wrote: If changing it by one point, it may just flip similarly in black's favour. Deepmind said that 7.5 with Chinese rule is as good as it gets.

This is more or less what I was referring to about idiosyncrasies of AlphaGo. For top pros, going between 5.5 and 7.5 might mean that the average winning percentages shift slightly toward white (and we aim to get it as close to 50% as we can). For AlphaGo (who would be extremely consistent in play) it might be that anything other than a 100% winrate suggests there is reasonable parity.

Life In 19x19 http://prod.lifein19x19.com/

AlphaGo selfplay http://prod.lifein19x19.com/viewtopic.php?f=10&t=14262	Page 1 of 2

Author:	Bill Spight [ Sat May 27, 2017 9:21 pm ]
Post subject:	Re: AlphaGo selfplay
alphaville wrote: Deepmind said that 7.5 with Chinese rule is as good as it gets.

Kirby wrote: It's not clear to me in my head that Deepmind would really know the appropriate komi (for certain). Changing the komi could have significant impact on game winning probabilities from a given board position, and could result in an altered strategy. So all of the training that's been happening through self-play might make the most sense having 7.5 komi. It could very well be that 7.5 komi is correct, but I would think that this should be investigated more scientifically if we really want an answer to this question.

Oh, I am not surprised that AlphaGo vs. AlphaGo games are closest to 50:50 with a komi of 7.5 under Chinese rules. I don't think that Silver would make that claim without having tried different komis.

But I doubt if the DeepMind team tried territory rules -- IIUC, even Zen does not use Japanese rules for training. But did they try Button Go (see http://senseis.xmp.net/?ButtonGo ) with a 7 pt. komi and a 1/2 pt. button? I doubt it. And since button go scores, like territory scores, normally have 1 pt. differences instead of 2 pt. differences, they might well find a komi that yields winning odds closer to 50:50 than Chinese scoring with 7.5 komi.

Author:	luigi [ Sat May 27, 2017 9:23 pm ]
Post subject:	Re: AlphaGo selfplay
One of the commentators in the Ke Jie match said that, in self-play, AlphaGo won only 45% of the time with Black, which is part of the reason why Ke Jie asked to be White in the last game. I think Chinese rules should seriously consider using 7 komi together with the button to prevent ties. The winning probability should be the same that way as it is in Japanese rules with 6.5 komi. EDIT: Heh, of course Bill Spight beat me to it.

Author:	Kirby [ Sat May 27, 2017 11:07 pm ]
Post subject:	Re: AlphaGo selfplay
Bill Spight wrote: Oh, I am not surprised that AlphaGo vs. AlphaGo games are closest to 50:50 with a komi of 7.5 under Chinese rules. I don't think that Silver would make that claim without having tried different komis.

Let's call the version of AlphaGo trained with komi of 7.5 AlphaGoX. Then I agree that it's likely that AlphaGoX vs. AlphaGoX probably has closest to 50:50 win rate using komi of 7.5 points.

But would a different type of AlphaGo that plays different moves have developed if it were trained with komi of, say, 10.5? The value network would have developed differently, probably. Let's call that hypothetical program AlphaGoY.

So an experiment where you put AlphaGoY vs. AlphaGoY might end up having games closest to 50:50 with a different komi than 7.5. Because AlphaGoY plays different types of moves than AlphaGoX... Isn't that possible?

Author:	Bill Spight [ Sun May 28, 2017 6:26 am ]
Post subject:	Re: AlphaGo selfplay
Bill Spight wrote: Oh, I am not surprised that AlphaGo vs. AlphaGo games are closest to 50:50 with a komi of 7.5 under Chinese rules. I don't think that Silver would make that claim without having tried different komis.

Kirby wrote: Let's call the version of AlphaGo trained with komi of 7.5 AlphaGoX. Then I agree that it's likely that AlphaGoX vs. AlphaGoX probably has closest to 50:50 win rate using komi of 7.5 points. But would a different type of AlphaGo that plays different moves have developed if it were trained with komi of, say, 10.5? The value network would have developed differently, probably. Let's call that hypothetical program AlphaGoY. So an experiment where you put AlphaGoY vs. AlphaGoY might end up having games closest to 50:50 with a different komi than 7.5. Because AlphaGoY plays different types of moves than AlphaGoX... Isn't that possible?

If I understand you correctly, don't we have the example of the development of komi in go history? Up until the mid-20th century, players trained on no komi games. You can see the difference in early go strategy. Black tended to play conservatively, while White played enterprisingly, to try to catch up. So was typically a kakari, and Black typically played first in three corners. According to go theory at that time, that gave a theoretical advantage to Black, but White felt the need to complicate the game.

With the advent of komi we saw the rise in popularity of parallel fuseki. On the assumption that the first four moves should be in an open corner, it is easy to show that a parallel fuseki is correct (even if a diagonal fuseki is, also), because each player can guarantee a parallel fuseki. Not that the early White kakari disappeared. Even Go Seigen recommended it in certain situations in his 21st century go writings.

The 4.5 komi soon proved to be too small. It took a long time, but the Japanese finally adopted a 6.5 komi, after decades of playing with a 5.5 komi. (Even in the 1970s results with both a 4.5 komi and a 5.5 komi suggested a 6.5 komi, as an article in the AGA Journal showed.) Ing adopted a 7.5 komi by the early '80s. For some time there was a question whether even the 7.5 komi was enough. (Practical komi tends to increase with the strength of the players, up to the theoretical komi.)

How much difference does 2 points make? Apparently not much. Despite being trained to a 4.5 komi, the median results of Japanese pros tended to a 1.5 - 2.5 win for Black. With the change to 5.5 komi, that became a 0.5 - 1.5 win for Black. In the time since the rise of the parallel fuseki, has there been any strategic change in play because of changing komi? Even with the higher komi, Go Seigen felt that White should make the game difficult for Black. Maybe he was an old man living in the past, but pros still valued his insights and advice.

Now, along comes AlphaGo, the strongest go player yet. It trained on a 7.5 komi, but would its practical results in self play suggest a komi of 9.5, even as the practical results of pros with a komi of 4.5 suggested a komi of 6.5? Why not, if the theoretical komi is greater than 7.5? (Komi by Chinese rules tends to shift by 2 point increments.) No, White has the advantage in AlphaGo vs. AlphaGo games with 7.5 komi, which suggests, if anything, that a 5.5 komi might be better.

Did the DeepMind team train a version of AlphaGo on a 5.5 komi? Maybe, but I kind of doubt it. Why bother? But I feel sure that they would not make any comments about komi unless they had millions of AlphaGo self-play games with a 5.5 komi. Is AlphaGo so brittle that training on a 7.5 komi would lead to relatively poor play at a 5.5 komi? I doubt it. Human pros were not so brittle with a 4.5 komi. They could have jumped to a 6.5 komi easily in the 1970s, just as they jumped to a 7.5 komi in the 1980s when they played by Ing rules.

Did AlphaGo, as White, find some new strategies to make the game more difficult for Black? I suspect so. Anyway, the main advance of AlphaGo over current pros seems to be in the realm of strategy. Much food for thought.

Author:	aeb [ Sun May 28, 2017 7:02 am ]
Post subject:	Re: AlphaGo selfplay
DeepMind announced that they would publish 50 selfplay games by AlphaGo, and published the second batch of 10 today. They can be found on http://homepages.cwi.nl/~aeb/go/games/games/AlphaGo/ both as tar-file and as separate games.

Author:	pookpooi [ Sun May 28, 2017 7:19 am ]
Post subject:	Re: AlphaGo selfplay
The other thread is so full of game records that it's hard to keep a conversation going. I count Black winning 12 out of 50 games, only 24%. If these games are not handpicked then.... Well, it's a human game anyway, so any change to komi has to be made by humans. But it seems many pros think the same as AlphaGo on this matter.
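For anyone who wants to put a number on how lopsided those counts are, here is a minimal sketch of an exact two-sided binomial test against a 50:50 split. It uses only the counts quoted in this thread (Black winning 2 of the first 10 games and 12 of all 50). The function name `binom_two_sided_p` is made up for illustration, and the test assumes the games are independent and not handpicked, which is exactly the caveat raised above.

```python
from math import comb


def binom_two_sided_p(wins: int, games: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed number of wins under win probability p.
    """
    pmf = [comb(games, k) * p**k * (1 - p) ** (games - k) for k in range(games + 1)]
    observed = pmf[wins]
    # Small relative tolerance so ties with the observed outcome are included.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Black's win counts as reported in the thread.
    for black_wins, games in [(2, 10), (12, 50)]:
        p_value = binom_two_sided_p(black_wins, games)
        print(f"Black {black_wins}/{games}: two-sided p = {p_value:.6f}")
```

A small p-value only says the split is unlikely under a fair coin for this particular set of games. It says nothing about which komi would be fair, nor about how a differently trained program would behave.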

Author:	aeb [ Sun May 28, 2017 7:26 am ]
Post subject:	Re: AlphaGo selfplay
DeepMind announced that they would publish 50 selfplay games by AlphaGo, and in fact published the remaining 40 today. They can be found on http://homepages.cwi.nl/~aeb/go/games/games/AlphaGo/ both as tar-file and as separate games. (The situation was messy. Their webpage did not work under Firefox or Chrome on Linux or macOS. Wget at first worked, and later got "403 Forbidden", and then worked again. Looks like the DeepMind people struggled to make this work as intended. Maybe got flooded with complaints and released all?)

Author:	Kirby [ Sun May 28, 2017 8:02 am ]
Post subject:	Re: AlphaGo selfplay
Bill, I think 7.5 komi may very well be correct komi - just want to point out that testing with current version of AlphaGo is not necessarily rigorous. For example, maybe AlphaGo trained with 10.5 komi plays very aggressively as black to overcome the point difference. You come up with a different program, so it remains possible that black wins more often against itself with this aggressive strategy. Seems unlikely to me, but I just feel a single version of AlphaGo may not be qualified to generally prove correct komi. A more rigorous test might be to train different versions of AlphaGo each optimized for different Komi values, and see which version had closest to 50% winrate when playing against itself. Even then, it's unclear how long to train each version to make a fair experiment. Or maybe I just don't see it in my head :-p

Author:	mistakenot [ Mon May 29, 2017 7:24 am ]
Post subject:	Re: AlphaGo selfplay
Spreadsheet with basic stats for the 50 games: https://docs.google.com/spreadsheets/d/ ... plIGo/view

Some trivial observations:

The longest game was #33 (346 moves).
The shortest game was #12 (180 moves).
Game #2 (where B287 captured 29 white stones) had the most white stones captured by black (51), but white won the game. It also had the most prisoners total (80), resulting in the fact that over 25% of the stones played in that game were captured by the end.
Game #10 (where W244 captured 22 black stones) mirrored #2 with an equivalent difference in magnitude between black and white prisoners but favoring white, yet black won the game.
In contrast, games #13 and #40 each saw only 4 stones captured. Game #13 was relatively short (182 moves) but #40 was moderately long (266 moves) which meant it had the lowest fraction of captured stones among the set (1.5%).
The game with the most stones on the board at the end was #5 (283 stones, after 307 moves and 24 captures).

Also, if needed here's another set of mirrors for the self-play games (ZIP archive) as well as links to download the SGFs directly from DeepMind: https://www.reddit.com/r/baduk/comments ... e/di5nwtl/

Incidentally, I noticed that most of DeepMind's SGFs were created with CGoban 3, except some from the last batch (#42, 44, and 46-50) which were created with GoGui 1.4.9. Doesn't make much difference, except the SGFs created by GoGui are slightly more compact than SGFs of similar length created by CGoban (because CGoban puts each move on a separate line).
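The arithmetic behind those figures is simple enough to sketch. The snippet below assumes every counted move places exactly one stone (no passes in the move count) and uses only the numbers quoted in the post above. The helper names are made up for illustration.

```python
def stones_on_board(moves: int, captures: int) -> int:
    """Stones left at the end: one stone placed per move, minus all prisoners."""
    return moves - captures


def captured_fraction(moves: int, captures: int) -> float:
    """Share of all stones played that ended up captured."""
    return captures / moves


if __name__ == "__main__":
    # (game number, moves, total captures) as quoted in the post.
    games = [(5, 307, 24), (13, 182, 4), (40, 266, 4)]
    for number, moves, captures in games:
        remaining = stones_on_board(moves, captures)
        share = 100 * captured_fraction(moves, captures)
        print(f"Game #{number}: {remaining} stones on board, {share:.1f}% captured")
```

Game #2 is left out because its total move count is not given in the post.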

Page 1 of 2	All times are UTC - 8 hours [ DST ]

Author:	Bill Spight [ Mon May 29, 2017 8:32 am ]
Post subject:	Re: AlphaGo selfplay
Kirby wrote: Bill, I think 7.5 komi may very well be correct komi - just want to point out that testing with current version of AlphaGo is not necessarily rigorous. For example, maybe AlphaGo trained with 10.5 komi plays very aggressively as black to overcome the point difference. You come up with a different program, so it remains possible that black wins more often against itself with this aggressive strategy.

Can't you apply that argument to players of yore who made overplays as White to overcome the lack of komi?

Kirby wrote: Seems unlikely to me, but I just feel a single version of AlphaGo may not be qualified to generally prove correct komi.

I don't think that we can prove correct komi.

Author:	Bill Spight [ Mon May 29, 2017 8:40 am ]
Post subject:	Re: AlphaGo selfplay
mistakenot wrote: Some trivial observations: Game #2 (where B287 captured 29 white stones) had the most white stones captured by black (51), but white won the game. It also had the most prisoners total (80), resulting in the fact that over 25% of the stones played in that game were captured by the end. Game #10 (where W244 captured 22 black stones) mirrored #2 with an equivalent difference in magnitude between black and white prisoners but favoring white, yet black won the game.

When I was starting out, I noticed that the number of captured stones was a pretty good predictor of the winner of a pro game. The player who captured more stones usually lost. (That's captured stones, not dead stones. It may matter.) Not that I made too much of that, and was not sure at all if it was generally true, but I certainly had no fear of sacrificing stones.

Author:	Mef [ Mon May 29, 2017 8:58 am ]
Post subject:	Re: AlphaGo selfplay
mistakenot wrote: Some trivial observations: Game #2 (where B287 captured 29 white stones) had the most white stones captured by black (51), but white won the game. It also had the most prisoners total (80), resulting in the fact that over 25% of the stones played in that game were captured by the end. Game #10 (where W244 captured 22 black stones) mirrored #2 with an equivalent difference in magnitude between black and white prisoners but favoring white, yet black won the game.

Bill Spight wrote: When I was starting out, I noticed that the number of captured stones was a pretty good predictor of the winner of a pro game. The player who captured more stones usually lost. (That's captured stones, not dead stones. It may matter.) Not that I made too much of that, and was not sure at all if it was generally true, but I certainly had no fear of sacrificing stones.

I suppose it takes a little bit of the sting out of losing ko fights as well (=

Author:	Mike Novack [ Mon May 29, 2017 9:25 am ]
Post subject:	Re: AlphaGo selfplay
Until/unless we have a better description of what is meant by AlphaGo vs AlphaGo I don't think we should draw conclusions. The same program, yes, in the sense of program vs program. But the same values in the neural nets the programs were emulating? I thought these self play games were being used to train the neural net (result in the altering of cell values) so I would expect the sets of those values to be slightly different << current "best" set of values vs set that MIGHT be better >> If that is the case, we can't speak of these games indicating "white favored at this komi" unless we know that the "current" and "trial" nets were randomly assigned colors.

Author:	Bill Spight [ Mon May 29, 2017 11:15 am ]
Post subject:	Re: AlphaGo selfplay
Mike Novack wrote: Until/unless we have a better description of what is meant by AlphaGo vs AlphaGo I don't think we should draw conclusions. The same program, yes, in the sense of program vs program. But the same values in the neural nets the programs were emulating? I thought these self play games were being used to train the neural net (result in the altering of cell values) so I would expect the sets of those values to be slightly different << current "best" set of values vs set that MIGHT be better >> If that is the case, we can't speak of these games indicating "white favored at this komi" unless we know that the "current" and "trial" nets were randomly assigned colors.

Hmm. My impression was that the same version of AlphaGo played itself, with suitable randomization of close choices. And then those games were used to train the neural networks, both for guessing moves and evaluating positions.

Author:	Bill Spight [ Mon May 29, 2017 11:20 am ]
Post subject:	Re: AlphaGo selfplay
mistakenot wrote: Some trivial observations: Game #2 (where B287 captured 29 white stones) had the most white stones captured by black (51), but white won the game. It also had the most prisoners total (80), resulting in the fact that over 25% of the stones played in that game were captured by the end. Game #10 (where W244 captured 22 black stones) mirrored #2 with an equivalent difference in magnitude between black and white prisoners but favoring white, yet black won the game.

Bill Spight wrote: When I was starting out, I noticed that the number of captured stones was a pretty good predictor of the winner of a pro game. The player who captured more stones usually lost. (That's captured stones, not dead stones. It may matter.) Not that I made too much of that, and was not sure at all if it was generally true, but I certainly had no fear of sacrificing stones.

Mef wrote: I suppose it takes a little bit of the sting out of losing ko fights as well (=

Yeah, I usually lost, errr, sacrificed two or three groups per game. Fairly early on I heard the proverb, If you don't know ko, you don't know go, and welcomed ko fights.