Games Arcade Spurs Robot Evolution

We have moved several paces closer to machines that can learn, following the recent publication of a research article in Nature by Google Deep Mind, the London-based Artificial Intelligence company. If this were biology, simple bacteria would have just evolved into worms. It took mother nature two billion years to perform that giant leap; it took us humans less than sixty years to go from simple programs to self-improving algorithms. The velocity at which we are developing intelligent machines is mind-boggling, and it will only increase. Deep Mind's recent publication is an important watershed because it lays the foundations for the accelerated development of new AI systems.

Borrowing ideas from neurobiological research, Demis Hassabis, the co-founder of Deep Mind, and his team created a semblance of a learning brain in a computer. Our brain is made up of neurons organised in hierarchical levels. Experiences strengthen, or weaken, connections between neurons. This modulation in the connectivity of our neural networks occurs in many ways, but one mechanism has been studied and understood quite well; it is based on a chemical naturally produced by the brain called dopamine. Every time we successfully achieve a goal, small doses of dopamine are secreted that strengthen the neural connections involved in performing the actions that led us to that goal. The "good feeling" of success is known in neuroscience as the "dopamine reward system", and it reinforces successful neural connections: we thus become better, smarter, and faster at solving problems.

Deep Mind simulated both mechanisms using artificial neural networks. They developed a learning algorithm, called the "deep Q-network", that functions similarly to the dopamine reward system, and they applied their technology to playing 49 different arcade games on the classic Atari 2600 of the 1980s. Simply put, the algorithm measures which actions earn the highest reward by analysing two sets of inputs: the game score and the patterns of pixels across four consecutive video frames. After spending several hours on each game the algorithm was able to learn strategies for achieving high scores, reaching the level of a professional human games tester. Deep Mind's algorithm is significant because it is a general solver: the programmer does not need to handcraft, or tweak, a program to bear upon a given problem. The deep Q-network is one-size-fits-all, similar to a human video game player who learns and becomes better at any game by trial and error.
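To make the reward-driven idea concrete, here is a minimal sketch of tabular Q-learning, the simpler relative of the deep Q-network (which replaces the table with a deep neural network reading raw screen pixels). The five-state "corridor" game, and all names and parameters in it, are illustrative assumptions for this sketch, not Deep Mind's actual code or environment.

```python
import random

N_STATES = 5           # states 0..4; state 4 is the goal and ends an episode
ACTIONS = [0, 1]       # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(state, action):
    """Toy game: reward 1 only when moving right reaches the goal state."""
    if action == 1:
        nxt = state + 1
        reached_goal = (nxt == N_STATES - 1)
        return nxt, (1.0 if reached_goal else 0.0), reached_goal
    return max(state - 1, 0), 0.0, False

def choose(q, s, rng):
    """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[s])
    return rng.choice([a for a in ACTIONS if q[s][a] == best])

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action] value table
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = choose(q, s, rng)
            s2, r, done = step(s, a)
            # The "reward" update: nudge Q(s, a) toward the score just earned
            # plus the discounted best value the next state promises.
            target = r + (0.0 if done else GAMMA * max(q[s2]))
            q[s][a] += ALPHA * (target - q[s][a])
            s = s2
    return q

q = train()
```

After training, the table prefers "right" in every non-terminal state, much as repeated dopamine-style reinforcement strengthens the connections behind a winning move.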

The next step would be to develop a memory of experiences for the computer, so it can transfer its acquired knowledge, just like we humans do. This is what makes us agile thinkers, for we do not tackle new problems from scratch but apply past experiences to explore shortcuts. We are masters of innovation because we are lateral thinkers. Although computers are still a long way from crying "eureka!", Deep Mind's breakthrough signifies that this moment is likely to come sooner than we think. Extrapolating from the timeline of evolution on planet Earth, it took another two billion years for worms to become humans. If we are now halfway towards human-level artificial intelligence, then we should be seeing truly intelligent robots by the end of this century.
