deepmind

AlphaGo Zero: Learning from scratch | DeepMind
A watershed moment. AlphaGo Zero learns to play the best Go in the world by starting from JUST the rules of Go and by playing against itself. No human intervention.
deepmind  ai  google  alphago  alphagozero  reinforcementlearning 
8 hours ago by drmeme
Artificial intelligence: ever more powerful, AlphaGo now learns without human data
In a paper published on Wednesday, October 18 in the prestigious scientific journal Nature, AlphaGo's creators announce that they have developed a considerably more powerful version of their program, one that, above all, is able to learn to play "without knowing anything about the game of Go"
IA  AlphaGo  DeepMind 
2 days ago by sentinelle
AlphaGo Zero: learning from scratch • DeepMind
Demis Hassabis and David Silver:
<p>The <a href="https://deepmind.com/documents/119/agz_unformatted_nature.pdf">paper</a> introduces AlphaGo Zero, the latest evolution of AlphaGo, the first computer program to defeat a world champion at the ancient Chinese game of Go. Zero is even more powerful and is arguably the strongest Go player in history.

Previous versions of AlphaGo initially trained on thousands of human amateur and professional games to learn how to play Go. AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play. In doing so, it quickly surpassed human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0.

<img src="https://storage.googleapis.com/deepmind-live-cms/documents/AlphaGo%2520Zero%2520Training%2520Time.gif" width="100%" />

It is able to do this by using a novel form of reinforcement learning, in which AlphaGo Zero becomes its own teacher. The system starts off with a neural network that knows nothing about the game of Go. It then plays games against itself, by combining this neural network with a powerful search algorithm. As it plays, the neural network is tuned and updated to predict moves, as well as the eventual winner of the games.

This updated neural network is then recombined with the search algorithm to create a new, stronger version of AlphaGo Zero, and the process begins again. In each iteration, the performance of the system improves by a small amount, and the quality of the self-play games increases, leading to more and more accurate neural networks and ever stronger versions of AlphaGo Zero.</p>
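The loop described in the excerpt (play against yourself using the current model plus a search, then train the model on the searched moves and the eventual winner, then repeat) can be sketched on a toy problem. This is only an illustration under loud assumptions, not DeepMind's implementation: the game is a small Nim variant rather than Go, two lookup tables stand in for the neural network, and a one-ply lookahead stands in for the "powerful search algorithm". Every name here (`search_policy`, `train`, `best_move`, the learning rate, the temperature) is hypothetical.

```python
import math
import random

MAX_TAKE = 3  # toy Nim: remove 1..3 stones per turn; taking the last stone wins


def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))


def search_policy(stones, policy, value, temperature=0.25):
    """One-ply lookahead (a stand-in for a real search): weight each move by
    the learned prior and by how bad the resulting position is for the opponent."""
    moves = legal_moves(stones)
    weights = [
        policy[stones][m] * math.exp(-value[stones - m] / temperature)
        for m in moves
    ]
    total = sum(weights)
    return {m: w / total for m, w in zip(moves, weights)}


def train(start=10, games=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    # value[s]: estimated result (+1 win, -1 loss) for the player to move at s.
    value = [0.0] * (start + 1)
    value[0] = -1.0  # no stones left means the player to move has already lost
    # policy[s][m]: prior probability of taking m stones at s (starts uniform).
    policy = {
        s: {m: 1.0 / len(legal_moves(s)) for m in legal_moves(s)}
        for s in range(1, start + 1)
    }

    for _ in range(games):
        # Self-play: both sides sample moves from the search-improved policy.
        stones, history = start, []
        while stones > 0:
            pi = search_policy(stones, policy, value)
            history.append((stones, pi))
            moves = list(pi)
            stones -= rng.choices(moves, weights=[pi[m] for m in moves])[0]

        # Whoever moved last won. Walk backwards, flipping the sign each ply,
        # and nudge the tables toward the search policy and the game result.
        outcome = 1.0
        for s, pi in reversed(history):
            value[s] += lr * (outcome - value[s])
            for m in pi:
                policy[s][m] += lr * (pi[m] - policy[s][m])
            outcome = -outcome
    return policy, value


def best_move(stones, policy, value):
    """Most probable move under the search-improved policy."""
    pi = search_policy(stones, policy, value)
    return max(pi, key=pi.get)
```

Calling `policy, value = train()` and then `best_move(5, policy, value)` should settle on leaving the opponent a multiple of four stones, which is the known winning strategy for this Nim variant. The tables start with no knowledge beyond the rules, which is the point the excerpt makes about starting from random play.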


This is mind-blowing. Granted, Go has a limited rule space, with fewer rules than most serious games, but creating the best Go player ever is still utterly incredible.

Though I was watching The Incredibles on Wednesday, in which Mr Incredible is used to train better and better Omnidroids until one can kill him. It always feels like a subtle warning.
ai  go  deepmind 
3 days ago by charlesarthur
