Jon Rimmer

Thoughts on AlphaStar

Deepmind's presentation of its Starcraft 2-playing AI AlphaStar is fascinating, and I highly recommend watching it if you haven't. Even as somebody who hasn't played any real-time strategy games since Warcraft 2 and isn't an AI expert, I found it easy to follow and very entertaining. They do a great job of communicating the excitement and achievement of building an AI capable of beating the best human players in the world at something as complex as Starcraft 2.

However, (spoilers for the video if you haven't watched it) there's a twist in the tale. Throughout the video Deepmind's researchers stress that they've tried to make the competition between AlphaStar and the human player as fair as possible. The AI doesn't have superhuman reaction times, nor does it make, numerically speaking, an exceptionally high number of inputs. But one advantage it does have is that it receives input on the entire playfield simultaneously. Whereas a human player must move the camera around the map, only being able to see what's happening in one spot at a time, AlphaStar has access to data on every unit (albeit constrained by the "fog of war" effect that hides enemy units when there isn't a friendly unit nearby).

While the Deepmind researchers explain that AlphaStar has evolved an attention-like mechanism, whereby it chooses to spend more time focussing on different areas of the map, it sounds like a flimsy excuse for what is a clear advantage relative to humans. This seems to be confirmed at the end of the video, when the team have MaNa, the pro Starcraft 2 player the AI previously beat 5-0, play a live exhibition match against a new, prototype version of the AI, which does have to interface with the game through a camera-like mechanism.
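For the unfamiliar: "attention" here means the agent learns to weight some parts of its input more heavily than others. The toy sketch below shows the general shape of the idea: score each map region against a query vector, turn the scores into weights with a softmax, and summarise the map as a weighted mix of region features. The function names and shapes are my own invention for illustration, not AlphaStar's actual architecture, which Deepmind hasn't published in this form.

```python
import numpy as np

def region_attention(region_features, query):
    """Toy attention over map regions (hypothetical, for illustration).
    region_features: (n_regions, d) array, one feature vector per region.
    query: (d,) vector representing what the agent currently cares about."""
    scores = region_features @ query          # relevance score per region
    scores -= scores.max()                    # stabilise the softmax numerically
    weights = np.exp(scores) / np.exp(scores).sum()
    summary = weights @ region_features       # attention-weighted map summary
    return weights, summary

# Example with 4 regions and 3 features per region (made-up numbers)
rng = np.random.default_rng(0)
weights, summary = region_attention(rng.normal(size=(4, 3)), rng.normal(size=3))
print(weights.round(3))  # how much "focus" each region receives; sums to 1
```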

While the researchers insist that, based on their internal estimates, the prototype AlphaStar is as strong as the previous version, the result of the match is rather different. MaNa plays a smart game, making sure to plant scouting units with visibility of the enemy's base, letting him tailor his strategy to whatever the AI is doing. Later, he guarantees victory when the AI makes a mistake all too typical of non-human video game opponents. MaNa keeps an airborne unit just outside AlphaStar's main base, flying it in to harass some of the buildings, then flying it out to safety just before the enemy's forces can arrive to destroy it.

Each time MaNa employs this harassment strategy, AlphaStar dumbly turns its entire army around and sends it backwards, hopelessly trying to catch the airborne attacker before it disappears again, then marches it all the way back out again. Anyone who's ever baited a programmed AI opponent in a game will recognise this situation. The AI is falling back on a pattern of behaviour without the meta-cognition necessary to recognise that it is making a mistake, that its opponent is exploiting its behaviour, and that it must formulate a new strategy.

In the aftermath of the video's release, other concerns have been raised about the fairness of the competition. For example, while the Deepmind researchers stressed that AlphaStar's average number of "actions per minute" was equivalent to or lower than that of the pro human players, it isn't just the quantity but the quality of these actions that counts. Pro human players tend to spam inputs just to keep their fingers warm and moving, and they are still limited by the laws of physics and biology in how quickly and accurately they can move and click the mouse.

AlphaStar, on the other hand, can make inputs with pixel-perfect precision, for a huge number of separate units at any given moment. And while the AI's average actions per minute was low, it did spike to near a thousand at key moments, when it was engaging in a large-scale battle. This was most obvious in the second of the presented games between MaNa and AlphaStar, when the AI overwhelmed its opponent with a huge army of relatively low-powered units, which it marshalled into an unstoppable three-pronged attack from all sides. The opposing, human-controlled army was more powerful on paper, but even a pro of MaNa's ability couldn't exercise the precise control necessary to repel this kind of assault.
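To see why averages mislead here, consider a toy calculation. The numbers below are invented, purely to illustrate the arithmetic, not taken from the actual matches:

```python
# Invented numbers, purely illustrative: a 10-minute game where most of
# the play is calm, but one 30-second battle is packed with actions.
match_minutes = 10.0
calm_actions = 1500    # spread across the quiet 9.5 minutes
battle_actions = 500   # all within a single 30-second engagement

average_apm = (calm_actions + battle_actions) / match_minutes
battle_apm = battle_actions / 0.5   # 30 seconds = 0.5 minutes

print(f"average APM: {average_apm:.0f}")  # 200, a plausible human figure
print(f"battle APM:  {battle_apm:.0f}")   # 1000, far beyond human hands
```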

The hosts called AlphaStar's play during this battle, and at other points, "superhuman", and they were correct. But it was telling that they didn't consider it to be displaying superhuman intelligence or strategy, just "microing" — the micro-scale control of individual units. More often, they were confused by and critical of elements of AlphaStar's play where, while engaging in some pro-level strategies, it also seemed to neglect others that would have made it stronger. The only major strategic novelty it displayed was consistently building more harvesting worker units than is considered optimal amongst pro players. Hardly revolutionary.

So, what conclusions can we draw from Deepmind's presentation? Well, they have definitely created an impressive system for training Starcraft 2 AI players, and this has resulted in the strongest bots ever seen. It seems likely that, with more data and more training, the agents they produce will be able to beat any human player most of the time. However, I think big questions remain as to whether the agents produced by Deepmind's system so far possess much genuine intelligence, and whether any kind of fair competition is possible between the agents and human players. Right now, even if AlphaStar were capable of acting with a human level of intelligence, its superior level of control might mean it often wouldn't need to in order to win.

It's possible to imagine producing a new version of Deepmind's training system that more accurately replicates the limits on human players. As well as a camera system, agents could be required to perform their inputs via a simulated cursor that would constrain them to an action rate and an accuracy equivalent to a pro player's. But does it make sense to do so?
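As a rough sketch of what that might look like, here's a hypothetical wrapper that drops inputs arriving faster than a human could issue them and scatters click coordinates to mimic imperfect mouse accuracy. The class, parameters, and numbers are all assumptions on my part, not anything Deepmind has described:

```python
import random

class HumanlikeCursor:
    """Hypothetical input constraint: throttles click rate and adds
    motor noise, roughly mimicking the physical limits of a human player."""

    def __init__(self, max_apm=350.0, accuracy_px=8.0):
        self.min_gap = 60.0 / max_apm    # minimum seconds between actions
        self.accuracy_px = accuracy_px   # std-dev of click scatter, in pixels
        self.last_action = float("-inf")

    def click(self, x, y, now):
        # Reject inputs that arrive faster than the allowed action rate.
        if now - self.last_action < self.min_gap:
            return None  # action dropped; the agent must wait
        self.last_action = now
        # Perturb the target point to simulate imperfect mouse accuracy.
        return (x + random.gauss(0.0, self.accuracy_px),
                y + random.gauss(0.0, self.accuracy_px))

cursor = HumanlikeCursor()
print(cursor.click(400, 300, now=0.00))  # lands near (400, 300)
print(cursor.click(410, 310, now=0.05))  # too soon after the last click: None
print(cursor.click(410, 310, now=0.25))  # enough time has passed: allowed
```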

The thing is, one of the key reasons we use computers and other machines to perform tasks is that they can be faster, stronger, and quicker to react than any human being. Perhaps the reason animals evolved intelligence in the first place was to compensate for the limitations of biology. It would be ironic if, in order to produce artificial intelligence, we also need to artificially constrain machines from using most of the abilities that make them so powerful. After all, the concept of "fairness" is relative. From the machine's perspective, if it has one, it's doing nothing wrong in using its strengths against the weaknesses of its human opponents.

If, in order to win at Starcraft, it's more useful to exercise ultra-precise control of hundreds of units than to be smart, then maybe that's just how it is. And maybe the question isn't how we constrain the agent to play "fairly", but whether Starcraft is truly a good choice to train strong artificial intelligence.

Photo by Drew Graham on Unsplash
