Jihan Yang
@jihanyang13
@amilabs; Prev. @NYU_Courant @HKUniversity; Researcher in Deep Learning, Computer Vision.
Joined November 2018
506 Following    1.2K Followers
Camera pose matters for video understanding! Today's MLLMs excel at recognizing activities, but still struggle with the underlying space and ego/object dynamics in video. We trace this gap to a missing piece: camera pose. Introducing Cambrian-P: a multimodal LLM natively grounded in camera pose. (1/n)