Built a "YouTube realtime copilot" browser extension using OpenAI's realtime 2 API:
The agent watches the video alongside you, and can answer any question you have about what was just said via realtime voice chat.
The crazy part to me: it can distinguish the YouTube audio stream from your voice, so it doesn't mistake the video's audio for commands, and it stays silent unless you ask something!