Google DeepMind’s AI Model Learns to Play Multiple 3D Games and Understand Verbal Instructions

Game-playing AI models have been around for decades, and nearly all of them specialize in a single game and play to win. Google DeepMind’s latest creation takes a different approach. Its model, named SIMA (Scalable Instructable Multiworld Agent), learns to play multiple 3D games the way a human does, and strives to understand and act on verbal instructions.

Unlike typical NPCs or AI characters in games, which are controlled indirectly through in-game commands, SIMA has no access to a game’s internal code or rules. Instead, it is trained on many hours of human gameplay video. Data labelers annotate this footage so the model can associate on-screen visuals with actions, objects, and interactions. Videos of players instructing each other in-game were also recorded to enrich the training data.

For instance, SIMA might learn that a certain pattern of pixel movement on the screen corresponds to the action of “moving forward.” Similarly, when the character approaches a door-like object and interacts with a doorknob-shaped element, the model understands that it is “opening a door.” These learned associations enable the model to recognize and perform tasks that go beyond simple key presses or object identification.
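To make the idea of inferring an action from raw pixels concrete, here is a deliberately toy sketch (not SIMA’s actual method, which uses a neural network over real video frames): two consecutive “frames” are reduced to 1-D rows of pixel values, and a consistent sideways shift of the scene is read as the character “moving forward.”

```python
# Toy illustration only: infer a coarse action label by comparing two
# consecutive "frames", simplified here to 1-D rows of pixel values.
# A consistent shift of the scene is interpreted as camera motion.

def shift_between(prev, curr):
    """Return the small offset k that best aligns prev with curr."""
    best_k, best_score = 0, -1
    n = len(prev)
    for k in range(-2, 3):  # test shifts of up to 2 pixels
        score = sum(
            1 for i in range(n)
            if 0 <= i + k < n and prev[i] == curr[i + k]
        )
        if score > best_score:
            best_k, best_score = k, score
    return best_k

def label_action(prev, curr):
    k = shift_between(prev, curr)
    if k > 0:
        return "moving forward"   # scene slides past the viewer
    if k < 0:
        return "moving backward"
    return "idle"

frame_a = [1, 2, 3, 4, 5, 6]
frame_b = [0, 1, 2, 3, 4, 5]  # same scene, shifted by one pixel
print(label_action(frame_a, frame_b))  # moving forward
```

A real system would of course learn these correspondences from labeled video rather than hand-coded alignment, but the principle is the same: regularities in pixel change stand in for actions.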

The training videos spanned multiple games, including Valheim and Goat Simulator 3, whose developers were involved and consented to the use of their software. One of the researchers’ primary objectives was to determine whether training an AI on one set of games would enable it to play others it hasn’t encountered before, a capability known as generalization.

The answer is affirmative, but with certain limitations. AI agents trained on multiple games performed better when exposed to games they hadn’t encountered before. However, unique mechanics and specific terms in different games can still challenge even well-prepared AI models. Nonetheless, with sufficient training data, there is potential for the model to learn and adapt to these differences.

This is partly because, despite varied in-game lingo, only a limited number of player actions meaningfully affect the game world. Whether a player assembles a lean-to, pitches a tent, or summons a magical shelter, the underlying action is essentially “building a house.” The researchers have mapped out several dozen recognized primitives that the SIMA agent can perform or combine, revealing fascinating insights into the model’s capabilities.
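The lingo-to-primitive collapse described above can be sketched as a simple lookup. This is a hypothetical illustration: the phrase lists and primitive names below are invented for the example, not taken from SIMA’s actual primitive map.

```python
# Hypothetical sketch of collapsing game-specific phrasing onto a
# small shared set of action primitives. All entries are invented
# for illustration; SIMA's real primitive map is not public.

PRIMITIVES = {
    "build shelter": {"assemble a lean-to", "pitch a tent",
                      "summon a magical shelter", "build a house"},
    "gather resource": {"chop wood", "mine ore", "harvest crops"},
    "open door": {"open the gate", "unlock the hatch", "open a door"},
}

def to_primitive(phrase):
    """Map a game-specific instruction onto a shared primitive."""
    for primitive, synonyms in PRIMITIVES.items():
        if phrase.lower() in synonyms:
            return primitive
    return None  # unrecognized lingo: no known primitive

print(to_primitive("Pitch a tent"))  # build shelter
```

In practice this mapping is learned from language and video rather than hand-written, which is what lets the agent handle phrasing it has never seen, but a fixed table makes the underlying idea easy to see.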

In addition to advancing the field of agent-based AI, the researchers aspire to develop a more natural and cooperative game-playing companion than today’s rigid, hard-coded characters. The idea is to have SIMA play alongside human players, who can give it instructions and work with it as a team.

As SIMA perceives the game solely through the pixels on the screen, it learns and adapts in a manner similar to humans. This adaptive nature enables the model to exhibit emergent behaviors, adding an element of unpredictability to its interactions.

A commonly used method for training agent-type AIs is the simulator approach, where an unsupervised model experiments in a 3D simulated world at an accelerated pace. This allows the model to intuitively learn the rules and design behaviors without extensive annotation. However, SIMA’s approach differs from this traditional method.

According to Tim Harley, one of the project leads, traditional simulator-based agent training relies on reinforcement learning, which requires a reward signal from the game or environment. However, evaluating such a reward signal for each possible goal in numerous games is not feasible. Instead, SIMA is trained using imitation learning from human behavior, with goals described in text. This approach allows the model to pursue a wide variety of tasks without being limited by a strict reward structure.
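The distinction Harley draws can be made concrete with a minimal sketch of imitation learning (behavior cloning): the policy is fit to human (state, instruction) → action demonstrations, and no reward signal from the game ever appears. The tiny tabular “policy” below is an invented stand-in for SIMA’s neural network, and the demonstration data is fabricated for illustration.

```python
from collections import Counter, defaultdict

# Minimal behavior-cloning sketch: fit a policy to human
# demonstrations of (state, instruction) -> action, with no reward
# signal anywhere. The tabular policy is a toy stand-in for a
# neural network; the demo data is invented for illustration.

demos = [
    (("at_door", "open the door"), "press_use"),
    (("at_door", "open the door"), "press_use"),
    (("at_tree", "chop the tree"), "swing_axe"),
    (("at_door", "open the door"), "press_jump"),  # noisy demo
]

def fit_policy(demonstrations):
    """Behavior cloning by majority vote per (state, instruction)."""
    votes = defaultdict(Counter)
    for (state, instruction), action in demonstrations:
        votes[(state, instruction)][action] += 1
    return {key: counter.most_common(1)[0][0]
            for key, counter in votes.items()}

policy = fit_policy(demos)
print(policy[("at_door", "open the door")])  # press_use
```

Contrast this with reinforcement learning, where the loop would instead query the environment for a reward after each action; it is exactly that per-goal, per-game reward signal that Harley says is infeasible to define across many games.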

Companies are exploring similar open-ended collaboration and creation approaches. For example, conversations with non-player characters (NPCs) are being studied as potential opportunities for employing LLM-type chatbots. Additionally, AI research into improvisation and simulated interactions in games is yielding intriguing results.

While experiments in infinite games like MarioGPT present another avenue of exploration, the focus remains on DeepMind’s SIMA and its ability to learn and adapt across multiple 3D games, promising a more natural and interactive gaming experience.

FAQ:

1. What is SIMA?
SIMA stands for Scalable Instructable Multiworld Agent, a deep learning model created by Google DeepMind. It learns to play multiple 3D games like a human and aims to understand and act on verbal instructions.

2. How is SIMA trained?
SIMA is trained using numerous hours of gameplay videos recorded by humans. Data labelers provide annotations to help the model associate visual representations of actions, objects, and interactions. Videos of players instructing each other in the game are also recorded to enhance the learning process.

3. Can SIMA play games it hasn’t encountered before?
Yes. AI agents trained on multiple games performed better when exposed to games they hadn’t encountered before. However, unique mechanics and game-specific terminology can still pose challenges for AI models.

4. What is the goal of the researchers behind SIMA?
The researchers aim to develop a more natural and cooperative game-playing companion than the current rigid, hard-coded characters. They want SIMA to play alongside human players, allowing them to provide instructions and work together as a team.

5. How does SIMA perceive the game?
SIMA perceives the game solely through the pixels on the screen, similar to how humans do. It learns and adapts in a manner similar to humans, exhibiting emergent behaviors and adding an element of unpredictability to its interactions.

6. How does SIMA differ from traditional simulator-based agent training?
Traditional simulator-based agent training relies on reinforcement learning, which requires a reward signal from the game or environment. SIMA, on the other hand, is trained using imitation learning from human behavior with goals described in text. This approach allows the model to pursue a wide variety of tasks without being limited by a strict reward structure.

Key Terms and Jargon:
– NPCs: Non-player characters, the computer-controlled characters in games.
– Generalization: The ability of an AI trained on one set of games to play others it hasn’t encountered before.
– Reinforcement Learning: A type of machine learning where an agent learns to take actions in an environment in order to maximize rewards.

Suggested Related Links:
DeepMind

The source of this article is the blog radiohotmusic.it
