Jihan Yang
@jihanyang13
@amilabs; Prev. @NYU_Courant @HKUniversity; Researcher in Deep Learning, Computer Vision.
Joined November 2018
506 Following    1.2K Followers
Camera pose matters for video understanding! Today's MLLMs excel at recognizing activities, but still struggle with the underlying space and ego/object dynamics in video. We trace this gap to a missing piece: camera pose. Introducing Cambrian-P: a multimodal LLM natively grounded in camera pose. (1/n)