Lyra 2.0 is an __open source framework__ developed by the NVIDIA Spatial Intelligence Lab that turns a single image into a __persistent, explorable 3D world__. The system uses a video diffusion model to generate a video along a controlled camera path, then reconstructs the result as __3D Gaussian Splats__ and meshes. It is distributed under the Apache 2.0 license, with weights on Hugging Face and code on GitHub, and is suited to robotic simulation, storyboarding and immersive content creation.
What is Lyra 2.0 (NVIDIA)?
Lyra 2.0 is an open source research framework for generating persistent 3D worlds from images. Where other approaches produce video sequences of limited duration, Lyra 2.0 focuses on spatial and temporal consistency to deliver an environment that can be explored in real time and exported to engines such as NVIDIA Isaac Sim. The project is led by the NVIDIA Spatial Intelligence Lab and published under the Apache 2.0 license, with code on GitHub and weights on Hugging Face. This openness makes it a useful reference point for academic researchers and for industry teams that want to integrate 3D generation into their products.
Main Features
Lyra 2.0 introduces several technical innovations. The pipeline starts from a single source image and uses a video diffusion model based on Wan 2.1-14B to generate a video along a camera path. This video is then reconstructed as 3D Gaussian Splats and meshes, which enables real-time exploration and export to physics engines. To address the consistency problems typical of this kind of generation, Lyra 2.0 relies on two key ideas: per-image geometry used for information routing, which reduces the loss of spatial information, and self-augmented training, which teaches the model to correct its own temporal drift. The result is an environment that is more stable, more consistent and more usable than those produced by previous approaches. The framework includes tools for exporting scenes to Isaac Sim, which opens the way to training robots in generated environments. The pipeline is modular, so researchers can extend it, modify it or combine it with other models. The open source release includes inference scripts, pre-trained models and example notebooks to ease adoption.
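To make the idea of a controlled camera path more concrete, here is a minimal sketch in plain Python that builds a short orbit of camera poses around a scene center. It is only an illustration of the concept and does not use Lyra 2.0's own code or API: the function names (`look_at`, `orbit_path`), the parameters (`radius`, `height`, `arc_degrees`) and the OpenGL-style convention (camera looking down its negative Z axis) are all assumptions made for this example.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Return a 4x4 camera-to-world matrix (nested lists).

    Assumed convention: the camera looks along its -Z axis,
    with +X to the right and +Y up.
    """
    forward = normalize(tuple(t - e for t, e in zip(target, eye)))
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)
    back = tuple(-c for c in forward)
    # Columns are: right, up, back, camera position.
    return [
        [right[0], true_up[0], back[0], eye[0]],
        [right[1], true_up[1], back[1], eye[1]],
        [right[2], true_up[2], back[2], eye[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]

def orbit_path(num_frames, radius=2.0, height=0.5, arc_degrees=60.0,
               target=(0.0, 0.0, 0.0)):
    """Hypothetical helper: one pose per frame on a horizontal arc around `target`."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    poses = []
    for i in range(num_frames):
        t = i / (num_frames - 1)                       # 0.0 .. 1.0 along the arc
        angle = math.radians((t - 0.5) * arc_degrees)  # centered on the start view
        eye = (target[0] + radius * math.sin(angle),
               target[1] + height,
               target[2] + radius * math.cos(angle))
        poses.append(look_at(eye, target))
    return poses

if __name__ == "__main__":
    path = orbit_path(num_frames=5)
    for frame, pose in enumerate(path):
        position = [round(row[3], 3) for row in pose[:3]]
        print(f"frame {frame}: camera position {position}")
```

In a pipeline of the kind described above, a sequence of poses like this would condition the video model on where the camera should be at each frame. How Lyra 2.0 actually encodes and consumes camera trajectories should be checked against its repository.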
Use Cases
Lyra 2.0 serves several kinds of creators and researchers. Robotics labs can use it to train agents in large-scale generated 3D environments, reducing their dependence on expensive physical scans. Video game and virtual reality studios can use it to produce preliminary sets or experimental environments. Film production teams can use it for immersive storyboarding, turning concepts into explorable scenes before shooting. Computer vision researchers can integrate the framework into their own pipelines to study spatial and temporal consistency. Finally, augmented reality creators can explore generating personalized environments from reference images.
Advantages
Adopting Lyra 2.0 brings several benefits for advanced users. Explorable 3D scenes can be produced far faster than with traditional pipelines, which require manual modeling, texturing and lighting. The Apache 2.0 license permits commercial use, subject to its attribution and notice requirements, which makes the framework attractive to startups and publishers. Compatibility with NVIDIA tools such as Isaac Sim simplifies integration into existing toolchains. Improved spatial and temporal consistency makes the environments more reliable for simulation and for training AI agents. Finally, open code and weights make it possible for a community to contribute to the framework and to propose optimizations for different hardware.
Pricing
Lyra 2.0 is an open source project distributed free of charge under the Apache 2.0 license. The code is available on GitHub and the weights on Hugging Face, and using the framework locally or in the cloud requires no additional commercial license. The associated costs are essentially the GPU resources needed for inference or training, which can be substantial depending on the use case. For teams without their own infrastructure, cloud providers such as AWS and GCP, as well as specialized platforms, offer H100 GPUs or equivalents suited to these workloads.
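As a rough guide to why data-center GPUs come up here, the sketch below estimates the memory needed just to hold the weights of a 14-billion-parameter model at different numeric precisions. The parameter count is an assumption read from the name of the base model mentioned earlier (Wan 2.1-14B). The estimate deliberately ignores activations, any auxiliary networks and the 3D reconstruction stage, so real requirements will be higher and should be checked against the project's documentation.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to store model weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Assumption: roughly 14 billion parameters, inferred from the "14B" in the model name.
NUM_PARAMS = 14e9

# Bytes used per parameter for common storage formats.
PRECISIONS = {"fp32": 4, "bf16/fp16": 2, "int8": 1}

if __name__ == "__main__":
    for name, nbytes in PRECISIONS.items():
        gb = weight_memory_gb(NUM_PARAMS, nbytes)
        print(f"{name:>9}: {gb:.1f} GB for weights alone")
```

At 16-bit precision this comes to about 28 GB for the weights alone, which fits on a single 80 GB H100 with headroom but already exceeds the memory of most consumer GPUs.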
Conclusion
Lyra 2.0 is a major advance in generating 3D worlds from images. Its openness, its output quality and its integration with the NVIDIA toolchain make it a reference framework for research and for some industrial uses. It will remain too technical for the general public, but for studios, labs and ambitious ML teams it is a strong candidate to adopt.