Tweets including SceneGeneration

cv usk @cv_usk

2026.06.16 21:38

🏠 Describe a room in plain text, and out comes a complete, physics-ready scene a robot can actually interact with. That's SceneSmith, an ICML 2026 Spotlight from MIT and Toyota Research Institute.

Title: nepfaff/scenesmith (SceneSmith)
URL:

🏠 Overview
SceneSmith is an agentic system that generates simulation-ready indoor scenes from natural language. It produces furniture, wall-mounted mirrors and artwork, ceiling chandeliers, and small tabletop items, all with physical properties like mass and inertia, so the scenes can be used directly for robot training and evaluation.

❓ Challenges Solved
Building realistic indoor scenes for robot simulation has meant manual modeling or tedious scene composition, a major bottleneck for scaling robot evaluation and training. SceneSmith removes this by automatically generating diverse, contextually coherent scenes from text prompts.

💡 Methodology & Approach
Scene generation runs as a five-stage sequential pipeline.
・Floor plan generation (walls and floor layout)
・Large furniture placement
・Wall-mounted objects (mirrors, artwork, shelves, clocks)
・Ceiling fixtures (chandeliers, pendant lights, ceiling fans)
・Manipulable small objects
Checkpoints are saved automatically after each stage, so you can resume or branch midway. Scene reasoning and task decomposition use a VLM agent (GPT-5).

🎯 Use Cases & Tech
・3D assets are generated with the high-quality SAM3D (recommended) or Hunyuan3D-2, with retrieval from HSSD and Objaverse also supported
・AmbientCG PBR materials are applied via CLIP-based semantic search, and articulated objects from ArtVIP and PartNet-Mobility are handled with joint kinematics
・Output is native Drake format, with export to MuJoCo, USD, and Isaac Sim

📊 Highlights
・From a task like "find a fruit from the bowl and place it on a plate," it generates multiple constrained scene variations and supports robot evaluation
・A 151-word prompt yields a community center, even inferring context like placing ping pong paddles and balls near the table
・Geometry generation is distributed across GPUs, with bubblewrap isolation preventing rendering OOM

#Robotics# #SceneGeneration#
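
To make the "five stages, checkpoint after each, resume or branch midway" idea concrete, here is a minimal Python sketch. It is not SceneSmith's code: the stage names paraphrase the post, and `run_stage`, `run_pipeline`, and the JSON checkpoint layout are hypothetical stand-ins chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Stage names paraphrase the post; they are NOT SceneSmith identifiers.
STAGES = ["floor_plan", "furniture", "wall_objects", "ceiling_fixtures", "small_objects"]


def run_stage(name, scene):
    """Hypothetical placeholder for a real generator: only records that the stage ran."""
    return {**scene, "completed": scene["completed"] + [name]}


def run_pipeline(prompt, ckpt_dir, resume_from=None):
    """Run the stages in order, writing one JSON checkpoint per stage.

    If resume_from names a stage, that stage's checkpoint is loaded and only
    the stages after it are run (the prompt argument is then ignored).
    """
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    if resume_from is None:
        scene, start = {"prompt": prompt, "completed": []}, 0
    else:
        scene = json.loads((ckpt_dir / f"{resume_from}.json").read_text())
        start = STAGES.index(resume_from) + 1
    for name in STAGES[start:]:
        scene = run_stage(name, scene)
        (ckpt_dir / f"{name}.json").write_text(json.dumps(scene))
    return scene


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        run_pipeline("a small reading room", d)
        # Re-run only the stages after large furniture placement.
        print(run_pipeline("a small reading room", d, resume_from="furniture")["completed"])
```

Branching would follow the same pattern: copy the checkpoint directory, then resume from the shared stage in each copy so the later stages can diverge.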

cv usk @cv_usk

2026.06.13 08:29

🏠 Just specify furniture with text or images, and get a style-consistent 3D indoor scene generated automatically, in about 85% less inference time than MMGDreamer.

Title: FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow
URL:

📝 Overview
FlowScene generates high-fidelity 3D indoor scenes from a multimodal scene graph that fuses text and images. It produces layout, shape, and texture in three branches via a straight-line rectified flow, keeping style consistent across the whole scene.

❓ Challenges Solved
Language-driven retrieval methods lack object-level control and style coherence, while graph-based methods struggle with high-quality textures. FlowScene addresses both weaknesses at once.

💡 Methodology & Proposed Approach
・It takes a multimodal graph where nodes fuse text descriptions and image features (text-only, image-only, or mixed)
・An InfoExchangeUnit densely exchanges node information during sampling to satisfy both individual and holistic conditions
・Layout (3D boxes), shape (VQ-VAE latents), and texture (anchored to geometry) are generated by independent denoisers
・Texture is denoised with geometry fixed, so even text-only nodes get style-consistent textures through information exchange

🎯 Use Cases
It fits interactive scene design for interior design and manufacturing, VR/AR content creation, and building simulation environments for robotics.

📊 Experimental Results
・Bedroom FID improves from 42.38 to 35.01 (lower is better), 17.4% better than MMGDreamer
・CLIPScore of 0.2386 is the best of all methods, and users rate style consistency 8.72/10
・Inference without textures takes 6.83s versus MMGDreamer's 45.34s, about 85% less time
・Object quality also improves, e.g. a 43.90% better minimum matching distance on nightstands

#3DGeneration# #GenerativeAI#
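
The percentages in the post can be checked from the raw numbers it quotes. The short Python sketch below assumes that 42.38 (FID) and 45.34s (inference time) are MMGDreamer's figures, as the post's wording implies; `relative_reduction` is a helper name introduced here for illustration.

```python
def relative_reduction(baseline, new):
    """Percent by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline * 100


# Figures quoted in the post; the baselines are assumed to be MMGDreamer's.
fid_gain = relative_reduction(42.38, 35.01)   # bedroom FID, lower is better
time_cut = relative_reduction(45.34, 6.83)    # inference time without textures, seconds
speedup = 45.34 / 6.83                        # the same timing gap as a ratio

print(f"FID reduction: {fid_gain:.1f}%")      # 17.4%
print(f"Time reduction: {time_cut:.1f}%")     # 84.9%
print(f"Speedup: {speedup:.2f}x")             # 6.64x
```

So the "about 85%" figure is a reduction in inference time, which corresponds to a speedup of roughly 6.6x.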