The key thing about this process is that the neural network doesn’t even know whether it’s correctly identifying state/action pairs when it starts—it doesn’t know how to “read”—much less whether it has correctly interpreted the advice they convey (do you build near a river, or should you never build by a river?). All it has to go on is what impact its interpretation has on the outcome of the game. In short, it has to figure out how to read the owner’s manual simply by trying different interpretations and seeing whether they improve its play.
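To make the idea concrete, here is a toy sketch of learning an interpretation purely from game outcomes. It is not the authors' method: it swaps their neural network for a simple epsilon-greedy bandit, and the two candidate readings, the win probabilities, and every function name are made up for illustration.

```python
import random

# Hypothetical: two candidate readings of one manual sentence, and the
# (hidden from the learner) chance of winning when acting on each reading.
WIN_PROB = {"build_near_river": 0.8, "avoid_river": 0.2}


def play_game(interpretation, rng):
    """Simulate one game; return 1.0 for a win, 0.0 for a loss."""
    return 1.0 if rng.random() < WIN_PROB[interpretation] else 0.0


def learn_interpretation(n_games=2000, epsilon=0.1, seed=0):
    """Estimate the value of each reading using only win/loss feedback."""
    rng = random.Random(seed)
    value = {name: 0.0 for name in WIN_PROB}  # running average win rate
    count = {name: 0 for name in WIN_PROB}    # games played per reading

    for _ in range(n_games):
        if rng.random() < epsilon:
            # Occasionally try a reading at random (exploration).
            choice = rng.choice(list(value))
        else:
            # Otherwise act on the reading that has worked best so far.
            choice = max(value, key=value.get)

        reward = play_game(choice, rng)
        count[choice] += 1
        # Incremental update of the average outcome for this reading.
        value[choice] += (reward - value[choice]) / count[choice]

    return value


if __name__ == "__main__":
    values = learn_interpretation()
    print("preferred reading:", max(values, key=values.get))
```

The learner is never told which reading is correct. It only sees whether it won, and the reading that leads to more wins gradually accumulates the higher estimated value.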
Despite the challenges, it works. When the full-text analysis was included, the success of the authors’ software shot up: it won over half its games within 100 moves, and beat the game’s AI almost 80 percent of the time when games were played to completion.