Lilac
@LilacML
(YC S25) Lilac serves fast model inference on idle GPUs.
18 Following    123 Followers
We now have an official Discord server.
We're excited to announce our partnership with @MiniMax_AI! Read more at
We've just launched Kimi K2.6: Come try it at !
Cache pricing is in 🎆 for GLM 5.1 and Kimi K2.5! Lowest cost in the industry at the highest throughput for shared endpoints. Read more at:
Real-time status is now available to all on our endpoints -- see it by hitting or go to our website: Fast Throughput, Low Latency, Low Price!
We are excited to be working with @InfronAI !
GLM 5.1 and Kimi K2.6 are now included in the subscription, including their Thinking variants. For these two model families, input tokens count 2x against included-input limits (for example, 1M input tokens on GLM 5.1 or Kimi K2.6 uses 2M included-input tokens).
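As a rough illustration of the 2x accounting described in the post above, here is a minimal sketch in Python. The model identifier strings and the helper name are hypothetical, and the 1x default for other models is an assumption; only the 2x rule for GLM 5.1 and Kimi K2.6 comes from the post.

```python
# Hypothetical model identifiers; only the 2x multiplier is taken from the post.
INPUT_MULTIPLIER = {
    "glm-5.1": 2,
    "kimi-k2.6": 2,
}


def included_input_used(model: str, input_tokens: int) -> int:
    """Return how many included-input tokens a request consumes.

    Models not listed are assumed to count 1x (an assumption, not stated in the post).
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens * INPUT_MULTIPLIER.get(model, 1)


# The post's own example: 1M input tokens on GLM 5.1 uses 2M included-input tokens.
print(included_input_used("glm-5.1", 1_000_000))
```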
Lilac's models are now available on -- Happy to be working with the amazing team @NanoGPTcom
GLM 5.1 up to 135 TPS per user! See us on @NanoGPTcom breaking records at one of the lowest costs
Our GLM 5.1 endpoint is averaging 70 TPS under full load right now 😋 Thanks to everyone using it!
We just launched GLM 5.1 at $0.90/M input and $3.00/M output — the lowest price among GLM 5.1 providers. We can price it this way because we serve inference on idle enterprise GPUs -- solving wasted compute while bringing affordable yet reliable inference to everyone. Try it:
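To make the per-million pricing in the post above concrete, a small sketch of a cost estimate follows. The function name is hypothetical, the rates are the launch prices quoted in the post, and cache pricing or any discounts are deliberately ignored here.

```python
from decimal import Decimal

# Launch prices quoted in the post, in USD per one million tokens.
GLM_51_INPUT_PER_M = Decimal("0.90")
GLM_51_OUTPUT_PER_M = Decimal("3.00")
ONE_MILLION = Decimal(1_000_000)


def estimate_glm_51_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate USD cost at the quoted rates (no cache pricing, no discounts)."""
    cost = (
        Decimal(input_tokens) / ONE_MILLION * GLM_51_INPUT_PER_M
        + Decimal(output_tokens) / ONE_MILLION * GLM_51_OUTPUT_PER_M
    )
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


# 2M input tokens and 0.5M output tokens: 2 * 0.90 + 0.5 * 3.00
print(estimate_glm_51_cost(2_000_000, 500_000))  # 3.30
```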
Idle enterprise GPUs make cheap inference possible. Lilac is serving Kimi K2.5 at $0.40/M input and $2.00/M output. 25% off for 3 months at 1B+ tokens/month. No contracts. No minimums. In our OpenRouter benchmark, we were the lowest-priced provider in a comparable speed band. Pricing + benchmark snapshot + API access:
#NVIDIAGTC Excited to be there, let's talk GPUs!
GPU infrastructure is broken. Most companies pay for 24/7 capacity but only use a fraction of it. Expensive hardware sits idle overnight, burning cash. We learned this during our time at @ycombinator. We built a Kubernetes operator to fix this. One kubectl command turns idle GPU time into revenue. You define the rules; we run paid AI workloads on your spare capacity. When you need the GPUs back, they return instantly with zero disruption. Your cluster. Your rules. We just make the idle time pay for itself. Lilac is now onboarding design partners. If you're running GPUs on K8s and want to stop paying for "dark capacity," let's talk. Full story:
We are at @Ai4Conferences in Las Vegas! Come talk to us about getting your team GPUs! We are excited to be partnering with @autonomous_labs over these 3 days, showcasing Lilac on their EdgeAI rent-to-own clusters.
🚀 @LilacML launched! Lilac Finds You GPUs! "Lilac! The open-source tool that ensures your data scientists always have GPUs." 🌐 Congrats Lucas Ewing & Ryan Ewing!!
Lilac (@LilacML) is an open-source tool that ensures your data scientists always have enough GPUs for their work. They seamlessly connect compute from any source, on-prem or cloud.