DeepMind has released SIMA 2, an advanced version of its AI agent designed to operate inside richly simulated 3D worlds. The new model blends language understanding, visual perception and action planning, allowing it to follow instructions, form strategies and learn through interaction rather than relying only on static datasets.
According to DeepMind, SIMA 2 uses the Gemini model to interpret goals, explain its reasoning and adapt to environments it has not encountered before.
The earlier version, SIMA 1, could follow more than 600 natural-language commands in controlled settings. SIMA 2 moves beyond that baseline by showing stronger generalisation across games with different physics, mechanics and objectives.
In internal testing, DeepMind placed the agent in unfamiliar virtual worlds, where it was able to make progress on tasks that were not part of its training material. The company describes this work as a step toward AI systems that learn continuously inside interactive spaces.
How SIMA 2 Works
SIMA 2 perceives the environment through on-screen visuals and acts through keyboard- and controller-style inputs. It can respond to sketches, symbols, gestures and instructions given in multiple languages.
The agent breaks down goals into smaller actions, plans sequences and adjusts behaviour as the context changes. DeepMind trained it through a combination of human demonstrations and large-scale synthetic experience generated as the agent explored on its own.
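The loop described above, in which a goal is broken into sub-tasks, executed step by step and replanned when the context resists, can be sketched in miniature. The sketch below is purely illustrative: the function names, the toy goal library and the dictionary-based "world" are assumptions for demonstration, not DeepMind's actual architecture or API.

```python
# Illustrative sketch only: a minimal plan-act-adapt loop with goal
# decomposition. All names and the toy environment are hypothetical,
# not drawn from SIMA 2 itself.

def decompose(goal):
    """Break a high-level goal into smaller ordered sub-tasks (toy rules)."""
    plans = {
        "build a shelter": ["gather wood", "find flat ground", "assemble walls"],
        "cook a meal": ["gather food", "light fire", "cook food"],
    }
    return plans.get(goal, [goal])  # an unknown goal becomes a single step

def act(step, world):
    """Attempt one sub-task; it succeeds only if the world affords it."""
    if step in world.get("affordances", []):
        world.setdefault("done", []).append(step)
        return True
    return False

def run_agent(goal, world, max_attempts=10):
    """Plan, execute, and adapt: defer failed steps and retry them later."""
    pending = decompose(goal)
    attempts = 0
    while pending and attempts < max_attempts:
        step = pending.pop(0)
        if not act(step, world):
            pending.append(step)  # adjust behaviour: push the step back
        attempts += 1
    return world.get("done", [])

world = {"affordances": ["gather wood", "find flat ground", "assemble walls"]}
print(run_agent("build a shelter", world))
# → ['gather wood', 'find flat ground', 'assemble walls']
```

The retry queue stands in, very loosely, for the adaptive behaviour the article describes; a real agent would replan with a learned model rather than a fixed rule table.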

DeepMind reports performance improvements in games such as ASKA and MineDojo, along with other simulation platforms. These gains came not from memorisation but from the agent's ability to recognise patterns and adapt in real time.
DeepMind notes that SIMA 2 is still a research model and not ready for commercial or real-world deployment.
Why SIMA 2 Matters
Work on agents like SIMA 2 reflects an expanding interest in AI that can learn by doing rather than only predicting text. Virtual 3D environments offer a space to test planning, spatial understanding and long-horizon tasks, all of which are relevant to robotics, training systems and simulation-based research.
Even with its improvements, SIMA 2 has clear limitations. It struggles with long tasks that require extended planning, depends on large amounts of computation and faces challenges when translating virtual actions into movements that could transfer to physical systems.
These constraints show why embodied intelligence remains an open research area.
What Comes Next
DeepMind plans to release further technical details through research papers and evaluation results. The next stage will involve testing whether the methods behind SIMA 2 can support work in robotics or more realistic simulations.
Researchers will also watch how other labs respond, and whether this approach becomes a foundation for broader agent-based AI development.
