On AlphaZero

DeepMind, a subsidiary of Alphabet, the parent company set up over Google to hold all the edgy research things (self-driving cars, huge AIs, whatnot), has shown off another Go engine that is also a chess engine. It’s extremely good. Here’s an article:


Perhaps unfortunately, there’s no clean threshold being crossed here – AlphaGo (an earlier but different engine by Alphabet) has already beaten the top human Go players. Deep Blue, IBM’s monster chess engine, beat Kasparov at chess in 1997 – we’re 21 years out from top humans dominating at chess, and we throw out phones with enough computing power to school grandmasters because they’re annoyingly slow. So the news this time isn’t so much that AlphaZero won over someone new (though it did beat all the other top software), but that it did it in a different way.

When Deep Blue won, some people pointed out that we didn’t learn that much about intelligence, or humans, or chess. It was a fair enough criticism. Machines tend to be better than people at things on the basis that they have more power. Pushing harder, moving faster, completing more calculations more quickly, that sort of thing. In a few areas, that’s an easy (by modern standards) win – you can’t outrun a car, or outdig a backhoe. But you might be able to, say, paint a nicer-looking flower, or determine another human being’s emotions by sight more accurately. Or, formerly, play better chess, and read distorted letters to prove your non-robot-ness online.

But the ones that fell, like those last two, didn’t fall to machines and software becoming smarter. They fell to shuffling the problem around until having more power thrown at it helped *just a tiny little bit*, and then throwing *mountains* of power at it. Deep Blue (and AlphaGo, the former Go champ) would pound through million after million of potential board positions, trying to sort out which ones might hold a slight advantage. Humans don’t do that, and never did, and still play very well. Way better than any machine held to the same hard limit of evaluating perhaps a few dozen different moves, looking at most a few back-and-forth moves deep along each. There was, and still is, something about humans that makes them better on an apples-to-apples level.

It’s as though a human could run faster than a vehicle of the same size, weight, and energy intake – sure, we could still do what we’ve always done and just grunt and build a bigger one, and it’d be useful, but there would be a nagging question of *why* we needed to. What is it that makes us faster? Well, nothing, because vehicles can move much faster using less energy, and we know why (for starters, they don’t heat themselves to 98.6 degrees and stay that way 24/7 or instantly stop working forever), but with intelligence... we kind of really don’t.

We get the core mechanics of the deal: it’s a network of nerves, sending signals to other nerves, triggering some layers of chemicals with some back and forth. It’s not especially fast. It’s rather big, but not *that* big, not so big that it’d win by sheer size. But the actual details are hard to see and rather obscure, too intricate and surrounded by too many parts that do nothing but stay out of the way.

Algorithms improve, though. The best chess or Go engines couldn’t be what they are without some improvements in evaluating positions more generally, without running them all to ground. That’s where AlphaZero has now made a serious advancement. It has only ever trained against itself, and its searches are significantly shallower than those of prior engines, and it’s still better. In other words, it’s taken a significant step toward learning the way humans learn, and winning the way humans win. It’s not there yet, of course. The actual playing part of the engine is still closer to traditional engines than it is to humans, and the learning part is so computationally demanding that it’s not even worth trying to recalculate it into centuries of PC time or some other metric that’s supposed to sound impressive – it’s just a stupidly large amount of power, beyond anything a human can visualize. But even so, it’s a step *toward* functioning more like a human. It used to be that human intelligence had, say, a 1,000,000 to 1 advantage over brute computational power, and we built winning machines by making them 1,000,001 times bigger. AlphaZero takes a small step in the other direction, throwing the power behind tuning up a system that only needs to be 100,000 times bigger and still wins. And while doing so, it also manages to somewhat highlight what is different about a top human player (intuition, elegance, insight) as opposed to a robot (unyielding borg-like repetition, cold, unimaginative depth-and-precision strong-arming).
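
To make the trade-off between search depth and evaluation quality a bit more concrete, here is a toy sketch in Python. It is not how AlphaZero works (that engine pairs a neural network with a tree search); it is just a tiny take-1-to-3-stones game I picked for illustration, where the player who takes the last stone wins. The names `blind_eval`, `informed_eval`, `negamax` and `best_move` are all made up for this example, and `informed_eval` simply hard-codes the known rule for this game as a stand-in for an evaluation that has been learned.

```python
from typing import Callable, List, Tuple

MOVES = (1, 2, 3)  # toy game (an assumption for illustration): take 1-3 stones, last stone wins


def blind_eval(stones: int) -> float:
    """Knows nothing: every unfinished position looks even."""
    return 0.0


def informed_eval(stones: int) -> float:
    """Stand-in for a learned evaluation: in this game the side to move
    is losing exactly when the pile is a multiple of 4."""
    return -1.0 if stones % 4 == 0 else 1.0


def negamax(stones: int, depth: int, evaluate: Callable[[int], float],
            counter: List[int]) -> float:
    """Score from the point of view of the side to move.
    counter[0] tallies how many positions were visited."""
    counter[0] += 1
    if stones == 0:
        return -1.0  # the opponent just took the last stone, so the side to move lost
    if depth == 0:
        return evaluate(stones)  # out of search budget: fall back on the evaluation
    return max(-negamax(stones - m, depth - 1, evaluate, counter)
               for m in MOVES if m <= stones)


def best_move(stones: int, depth: int,
              evaluate: Callable[[int], float]) -> Tuple[int, int]:
    """Return (chosen move, positions visited) for a search `depth` plies deep."""
    counter = [0]
    scored = [(-negamax(stones - m, depth - 1, evaluate, counter), m)
              for m in MOVES if m <= stones]
    # Highest score wins; ties go to the smallest move.
    _, move = max(scored, key=lambda t: (t[0], -t[1]))
    return move, counter[0]


if __name__ == "__main__":
    print("deep search, blind evaluation:   ", best_move(14, 14, blind_eval))
    print("shallow search, blind evaluation:", best_move(14, 1, blind_eval))
    print("shallow search, good evaluation: ", best_move(14, 1, informed_eval))
```

From a pile of 14, the right move is to take 2 and leave a multiple of 4. The deep search with no knowledge finds that move, but only after visiting thousands of positions. The one-ply search with no knowledge looks at 3 positions and gets it wrong. The one-ply search with a good evaluation looks at the same 3 positions and gets it right. That last row is the direction the paragraph above is describing: spend the effort on the evaluation, and the search can afford to be much smaller.
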
This is still not the type of path that we can all explore, not with this level of demands. But it’s going to be very interesting to see what else it can do when the same tactics get to hit other problems we solve by sheer force. Integrated circuit designs, road network routing, street light timings, turbine blade shapes, stealth airfoils, six-axis-unstable flight... there are a lot of things we build that are only possible because we can crunch numbers at them until something works. Human imagination and intuition aren’t enough to “see” the elegant, clever, beautiful solution, and machines can only find the cold, pragmatic, by-the-numbers, sufficient one. What we see here is a step toward a remedy for that, a sort of finder of solution finders. As a proof of concept, it has found at least a few solutions that are much more like the ones we wish we could build, rather than the best we can manage on abstract problems that are too complex to reel back into human visionary space no matter how we try to reformulate them.
