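🗺️ Even the state-of-the-art GPT-5 succeeds on only 14.4% of real-world spatial tasks. A new benchmark has arrived that exposes how weak AI agents are at "active spatial reasoning", a skill that cannot be measured by looking at a still image and answering a question.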
Title: SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks
URL:
Overview
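SpatialWorld is a benchmark that measures whether multimodal LLMs can solve tasks while actively exploring 3D environments from a vision-only, first-person viewpoint. It unifies 8 different simulators spanning indoor, outdoor, and digital-game settings under a common protocol, and evaluates 15 state-of-the-art models on 760 hand-crafted tasks. Agents receive no pre-built map and no ground-truth procedure, so they must look, move, and decide for themselves.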
Problem addressed
Existing spatial-reasoning benchmarks rely on passive evaluation using static VQA or pre-recorded video. That setup cannot measure the interactive spatial understanding the real world demands, where an agent moves its own viewpoint to gather visual evidence and replans on the spot under partial observability. There is a large gap between recognizing a static scene and actually moving through an unfamiliar space to solve a task.
💡 Methodology and proposed approach
・The task is formulated as a vision-only POMDP (partially observable Markov decision process).
・The agent receives only a natural-language goal and a single native-resolution first-person RGB image; depth, maps, and semantic metadata are never provided.
・Actions are issued through a text-based, high-level interface covering navigation, viewpoint control, object interaction, and task completion.
・Eight backends are integrated: indoor (AI2-THOR, ProcTHOR, VirtualHome), outdoor (CARLA, EmbodiedCity), and digital games (Block3D, Snake3D, Rubik's Cube).
・Success is judged by whether the final terminal state satisfies the goal, not by matching an intermediate trajectory, and validity is checked by hand.
・In addition to success rate, step efficiency relative to human reference trajectories is measured, which makes inefficient behavior visible.
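
The protocol in the list above can be sketched as a small episode loop. This is only an illustration, not SpatialWorld's actual API: `Observation`, `ToyCorridor`, `run_episode`, the action strings, and the efficiency formula (human reference steps divided by agent steps, capped at 1, and 0 on failure) are all hypothetical names and assumptions, since this summary does not give the benchmark's exact definitions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Protocol


@dataclass
class Observation:
    goal: str   # natural-language goal
    rgb: bytes  # one first-person RGB frame; no depth, map, or semantic metadata


class Env(Protocol):
    """Hypothetical common interface that every simulator backend would implement."""
    def reset(self) -> Observation: ...
    def step(self, action: str) -> Observation: ...
    def goal_satisfied(self) -> bool: ...


class ToyCorridor:
    """Stand-in environment (assumption): a 1D corridor with a target cell."""
    def __init__(self, goal_pos: int) -> None:
        self.goal_pos = goal_pos
        self.pos = 0

    def _observe(self) -> Observation:
        # The "image" is a placeholder byte string; a real backend would render a frame.
        return Observation(goal=f"reach cell {self.goal_pos}", rgb=bytes([self.pos]))

    def reset(self) -> Observation:
        self.pos = 0
        return self._observe()

    def step(self, action: str) -> Observation:
        if action == "move_forward":
            self.pos += 1
        elif action == "move_back":
            self.pos = max(0, self.pos - 1)
        return self._observe()

    def goal_satisfied(self) -> bool:
        return self.pos == self.goal_pos


@dataclass
class EpisodeResult:
    success: bool
    steps: int


def run_episode(env: Env, policy: Callable[[Observation], str],
                max_steps: int = 50) -> EpisodeResult:
    """Run one episode; success depends only on the terminal state."""
    obs = env.reset()
    steps = 0
    while steps < max_steps:
        action = policy(obs)      # the policy returns a text action
        if action == "done":      # task-completion action ends the episode
            break
        obs = env.step(action)
        steps += 1                # "done" itself is not counted as a step
    return EpisodeResult(success=env.goal_satisfied(), steps=steps)


def step_efficiency(result: EpisodeResult, human_steps: int) -> float:
    """Assumed metric: human_steps / agent_steps, capped at 1; 0 on failure."""
    if not result.success:
        return 0.0
    return min(1.0, human_steps / max(result.steps, 1))


def scripted_policy(actions: Iterable[str]) -> Callable[[Observation], str]:
    """Replay a fixed action list, then emit "done" (stands in for a model)."""
    it = iter(actions)
    return lambda obs: next(it, "done")


# A wasteful but successful run: one unnecessary back-and-forth.
wasteful = ["move_forward", "move_back", "move_forward", "move_forward", "move_forward"]
result = run_episode(ToyCorridor(goal_pos=3), scripted_policy(wasteful))
efficiency = step_efficiency(result, human_steps=3)
```

In this toy run the agent reaches the goal, so it counts as a success even though its path differs from the shortest one, and the extra back-and-forth shows up only in the efficiency score. That mirrors the point made above about separating terminal-state success from step efficiency.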
🎯 Use cases
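The benchmark provides a basis for evaluating the spatial abilities of household robots and autonomous agents in a unified, fair way before they are deployed in real environments. It allows systematic diagnosis of where agents stumble in long-horizon tasks that combine navigation and object manipulation, and it can serve as a rigorous testbed for improving spatial-reasoning models.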
Experimental results
・Across the 15 state-of-the-art models evaluated, success rates on physical tasks stayed low: GPT-5 at 14.4%, Qwen-3.5-397B at 12.2%, Gemini-3.1-Pro at 9.2%, and Kimi-K2.5 at 9.2%.
・On digital games, Gemini-3.1-Pro scored highest at 39.0%, followed by GPT-5 at 36.4%.
・Broken down by complexity, interaction-only tasks averaged 50.2%, navigation-only tasks 8.6%, and composite tasks combining the two plunged to just 4.2%.
・Models with similar success rates still differed widely in efficiency score, revealing that many models get by on trial and error.
・Model rankings shifted substantially from one environment to another, and no single all-round model dominated every category.
#AIAgents# #SpatialReasoning#