AlphaGo Zero paper
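Remarkable for many reasons. In a lot of ways it's much simpler than the original AlphaGo paper. A single network with a very simple loss function (squared value error plus policy cross-entropy plus L2 regularization). No human training data, only self-play. Changed to ResNets. Less hardware (a single machine with 4 TPUs for play). The final version reaches an Elo rating above 5000, and the first run needed only about three days (and about 4.9 million self-play games) to beat the AlphaGo version that defeated Lee Sedol. Very remarkable.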
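
To show how simple the loss is, here is a minimal NumPy sketch for a single position. It assumes the form given in the paper, l = (z - v)^2 - pi^T log p + c * ||theta||^2, where z is the game outcome, v the predicted value, pi the MCTS visit distribution and p the network's move probabilities. The function name, argument names and the default c = 1e-4 are my own choices for illustration, not code from the paper.

```python
import numpy as np

def alphago_zero_loss(z, v, pi, policy_logits, params, c=1e-4):
    """Sketch of the combined loss for one position (names are illustrative).

    z: game outcome from the current player's view (+1 or -1)
    v: value predicted by the network
    pi: MCTS search probabilities over moves (sums to 1)
    policy_logits: raw network outputs over the same moves
    params: iterable of weight arrays, used for L2 regularization
    c: L2 regularization strength (assumed default)
    """
    # Numerically stable log-softmax of the policy logits.
    shifted = policy_logits - np.max(policy_logits)
    log_p = shifted - np.log(np.sum(np.exp(shifted)))

    value_loss = (z - v) ** 2            # squared error on the outcome
    policy_loss = -np.dot(pi, log_p)     # cross-entropy against the search policy
    l2_penalty = c * sum(np.sum(w ** 2) for w in params)
    return value_loss + policy_loss + l2_penalty
```

In real training this would be averaged over a mini-batch of positions sampled from recent self-play games and minimized with gradient descent; the sketch only shows the per-position terms.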
