2. (Real-time fact-checking) - The Interaction Models hear you speak and fact-check you in real time, like having a teammate who's always paying attention.
I’ve been telling people this a lot today: I really enjoy working with people who care about what they are building and about craftsmanship. It is a privilege to have a chance to work on something I’m passionate about, beyond making a living. I cherish it and don’t take it for granted.
On-policy distillation offers an elegant way to use the teacher model as a process reward model, providing dense rewards while avoiding SFT-style "OOD shock" during rollout.
Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. Applying it to math reasoning and to an internal chat assistant, we find that on-policy distillation can outperform other approaches at a fraction of the cost.
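To make the idea concrete, here is a minimal toy sketch of on-policy distillation, assuming per-token reverse KL against the teacher as the dense signal; the TinyLM model, sizes, and hyperparameters are illustrative stand-ins, not the actual training setup.

```python
# Toy sketch of on-policy distillation (illustrative, not the real recipe):
# the student samples its own rollouts (on-policy), and the teacher scores
# every generated token, giving dense per-token supervision instead of a
# single sequence-level reward.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, GEN_LEN = 100, 64, 16

class TinyLM(nn.Module):
    """Stand-in autoregressive LM: embedding -> GRU -> vocab logits."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # [batch, seq, vocab]

teacher, student = TinyLM(), TinyLM()
teacher.eval()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def distill_step(prompt):
    prompt_len = prompt.shape[1]

    # 1) On-policy rollout: the *student* generates its own continuation,
    #    so training stays on the distribution it will actually see.
    tokens = prompt.clone()
    with torch.no_grad():
        for _ in range(GEN_LEN):
            next_logits = student(tokens)[:, -1]
            next_tok = torch.distributions.Categorical(logits=next_logits).sample()
            tokens = torch.cat([tokens, next_tok.unsqueeze(1)], dim=1)

    # 2) Dense supervision: the teacher acts like a process reward model,
    #    scoring every generated position; the loss is per-token reverse KL.
    student_logits = student(tokens[:, :-1])[:, prompt_len - 1:]
    with torch.no_grad():
        teacher_logits = teacher(tokens[:, :-1])[:, prompt_len - 1:]

    log_q = F.log_softmax(student_logits, dim=-1)   # student
    log_p = F.log_softmax(teacher_logits, dim=-1)   # teacher
    reverse_kl = (log_q.exp() * (log_q - log_p)).sum(-1)  # [batch, GEN_LEN]

    loss = reverse_kl.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

prompt = torch.randint(0, VOCAB, (4, 8))
print(distill_step(prompt))
```

The key point is that the tokens come from the student's own rollouts, so the teacher's dense feedback is always on-distribution for the student, which is what avoids the SFT-style OOD shock.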
GPUs are expensive, and setting up the infrastructure to make them work properly is complex, which makes experimentation on cutting-edge models challenging for researchers and ML practitioners.
Providing high-quality research tooling is one of the most effective ways to improve the research productivity of the wider community, and the Tinker API is one step toward that mission.
Tinker API is built on top of our experimental results on fine-tuning with LoRA:
The beta is starting, and you can join the waitlist today:
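For anyone who hasn't worked with LoRA before, here is a minimal sketch of the underlying idea the fine-tuning results build on: a frozen base layer plus a trainable low-rank update. The layer sizes, rank, and scaling below are illustrative defaults, not Tinker's.

```python
# Minimal LoRA-style adapter (illustrative): only the low-rank matrices
# A and B are trained; the base weights stay frozen.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the adapter receives gradients
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Effective weight is W + scale * B @ A, applied without materializing it.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(512, 512))
x = torch.randn(4, 512)
print(layer(x).shape)  # torch.Size([4, 512])
```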
Looking through those little hidden-gem stories in the footnotes, you will find it inspiring that researchers interested in the same topic are able to work together to advance a field, regardless of their roles and locations. This is the power of open science and community.
Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theoretical step toward more stable and performant training by co-designing neural net optimizers with manifold constraints on weight matrices.
We work toward a fundamental understanding of the geometry of neural network optimization.
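As a rough illustration of what a manifold constraint on weight matrices can look like (a generic projected-gradient sketch, not the specific Modular Manifolds algorithm), each weight matrix below is retracted to the closest (semi-)orthogonal matrix after every optimizer step, keeping its spectral norm controlled throughout training.

```python
# Illustrative manifold-constrained training loop: optimize normally, then
# pull each weight matrix back onto the Stiefel manifold via its SVD.
import torch
import torch.nn as nn

def retract_to_stiefel(weight: torch.Tensor) -> torch.Tensor:
    """Closest (semi-)orthogonal matrix in Frobenius norm: the polar factor."""
    U, _, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U @ Vh

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
for step in range(100):
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Constraint step: re-project every weight matrix onto the manifold.
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, nn.Linear):
                module.weight.copy_(retract_to_stiefel(module.weight))
```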
I recently joined @thinkymachines -- super excited to work with the team; I think we have the highest density of research talent in the world 🙂
we have a very ambitious roadmap ahead, the right team to work on it, & I think now is a great time to join; you should reach out to the team if that excites you!
We have been working hard for the past 6 months on what I believe is the most ambitious multimodal AI program in the world. It is fantastic to see how pieces of a system that previously seemed intractable just fall into place. Feeling so lucky to create the future with this talented and aligned team.
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence.
We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're excited that in the next couple of months we’ll be able to share our first product, which will include a significant open-source component and be useful for researchers and startups developing custom models. Soon, we’ll also share our best science to help the research community better understand frontier AI systems.
To accelerate our progress, we’re happy to confirm that we’ve raised $2B in a round led by a16z, with participation from NVIDIA, Accel, ServiceNow, Cisco, AMD, Jane Street, and others who share our mission.
We’re always looking for extraordinary talent that learns by doing, turning research into useful things. We believe AI should serve as an extension of individual agency and, in the spirit of freedom, be distributed as widely and equitably as possible. We hope this vision resonates with those who share our commitment to advancing the field. If so, join us.
I still find it mysterious whether and how intelligence and capabilities transfer between domains and skills - from meta-learning in the early days to the more recent question of whether solving math problems helps with writing a good essay.
Sometimes I feel a bit pessimistic, given how little evidence I’ve seen. I would love more suggestions and pointers to papers on this topic of generalization in the thread! 🧵
Probably the first product Thinky will build is a full panel of dials that researchers can use to physically adjust all the hparams during training. We're gonna do hardware one day, and now is the time 😂
Giving your models more time to think before prediction, e.g. via smart decoding, chain-of-thought reasoning, latent thoughts, etc., turns out to be quite effective for unlocking the next level of intelligence.
New post is here :)
“Why we think”:
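One simple, concrete form of test-time thinking is self-consistency decoding: sample several reasoning chains and take a majority vote over the final answers. In the sketch below, `sample_chain_and_answer` is a hypothetical stub for whatever model call you would actually use; the voting logic is the point.

```python
# Self-consistency decoding sketch: spend extra test-time compute by sampling
# multiple reasoning chains and returning the most common final answer.
import random
from collections import Counter

def sample_chain_and_answer(question: str) -> str:
    """Hypothetical stub: sample one chain-of-thought and return its final answer."""
    # A real implementation would call an LLM with temperature > 0.
    return random.choice(["42", "42", "41"])

def self_consistency(question: str, n_samples: int = 16) -> str:
    """Sample n chains independently and majority-vote the answer."""
    answers = [sample_chain_and_answer(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))
```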
When a new dataset comes out, I get excited and check it out, only to realize that it's another meta-mixed dataset combining a collection of other existing datasets. My brain immediately goes "oh fork ... contamination!" No meta-meta-mixed datasets plzzzz :lolsob:
About 650 / 770 signed at this moment. As people start waking up, more will come. The whole effort started after 1:30 AM, 500+ signatures came within two hours, and all of this after two crazy days with very little sleep.