Jihan Yang
@jihanyang13
@amilabs; Prev. @NYU_Courant @HKUniversity; Researcher in Deep Learning, Computer Vision.
Joined November 2018
506 Following    1.2K Followers
Camera pose matters for video understanding! Today's MLLMs excel at recognizing activities, but still struggle with the underlying space and ego/object dynamics in video. We trace this gap to a missing piece: camera pose. Introducing Cambrian-P: a multimodal LLM natively grounded in camera pose. (1/n)