what's stopping nous research from building this?
Everyone who signed up in the last hour should have gotten invites to the Hermes Agent Mobile App.
Happy testing!
the things that actually changed how i work are small and unglamorous, a tmux config, a control script, the git setup my agents live in, and i open every one of them daily.
build something you come back to.
most people treat git as a chore. commit, push, don't forget, a discipline you impose on yourself. in an agentic setup it stops being that, git becomes the thing your agents talk through.
one agent does work and commits. the next one pulls and now it has everything: the code, the docs, the context the last agent left behind. you spin up a fresh agent and it is not a blank slate, it pulls the repo and already knows. no re-briefing, no pasting context into a chat window.
the agents never have to talk to each other directly. one leaves state in the repo, the next picks it up, async and durable, and every commit is a log of exactly what each agent did, so you can audit the whole run after.
that is why the git server being private and yours matters. it is not about secrecy, it is that the repo is infrastructure now, the shared memory the whole system runs on. a merge agent sits on top and handles the conflicts.
private git isn't a step you enforce, it's the medium.
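a minimal sketch of that handoff, assuming each agent works in its own clone of the shared repo. the function names, the `[agent] summary` commit format, and the exact command order are illustrative assumptions, not the actual control script.

```python
import subprocess
from pathlib import Path


def start_commands() -> list[list[str]]:
    """Commands an agent runs before it starts: pick up the last agent's state."""
    return [["git", "pull", "--ff-only"]]


def finish_commands(agent: str, summary: str) -> list[list[str]]:
    """Commands an agent runs when it is done: leave state for the next one.

    The order (add, commit, pull --rebase, push) is an assumed workflow.
    """
    return [
        ["git", "add", "--all"],                          # stage code, docs, and notes together
        ["git", "commit", "-m", f"[{agent}] {summary}"],  # the commit message doubles as the audit log
        ["git", "pull", "--rebase"],                      # replay on top of whatever landed meanwhile
        ["git", "push"],                                  # publish state for the next agent
    ]


def run(repo: Path, commands: list[list[str]]) -> None:
    """Run each command inside `repo`, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=repo, check=True)
```

note that `git commit` exits non-zero when there is nothing to commit, so `run` stops there. a real script would likely check for changes first, and a rebase conflict is where a merge agent would take over.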
@DavidBennett__ not nagging, it's the architecture. i use private git, not github, the repo of code and docs is the memory itself. every new agent pulls it, gets full context, commits back. a merge agent handles the merging. git isn't a step you enforce, it's how the agents share state.
i posted a list about tailscale and tmux, the most unsexy thing i could think of, and it's about to cross 100k views.
that's not me, that's the signal, the agentic setup corner is way bigger and hungrier than the timeline lets on and almost nobody is posting into it. more of this coming.
anyone thinking about, learning, or already working with agentic systems, you should know this.
the first few steps of your setup matter more than any model or framework you pick later. get them right and you never lose your flow.
the foundation nobody posts about:
> 1. tailscale. a private mesh network across every machine you own. laptop, desktop, rented node, all on one secure tailnet, reachable from anywhere. nothing else works well until this does.
> 2. termius, over that tailnet. one SSH client that reaches every node, phone included. you are never away from your stack.
> 3. tmux. persistent sessions. disconnect, close the laptop, come back, every session exactly where you left it. agentic work runs long, your terminal has to survive that.
> 4. a private git repo. the one i am most glad i found. it is the memory layer across all my agents, they pull, they work, they merge back, the codebase stays alive between sessions. context that would die in a chat window lives in the repo instead.
> 5. script everything from day one. ssh aliases for every node, setup scripts, the boring boilerplate automated. if you will do a thing more than twice, it is a script.
everything past these five is decorative. know these cold.
and the habit that ties it together: ask the AI itself. for the config, for the error, for any of it, let the agent do the lifting, then double check what it hands you.
lock the five, build the habit, and you make it. skip it, anon, and you ngmi.
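foundation 5 in its smallest form: a sketch that generates ssh aliases for every node instead of typing them by hand. the node names, hostnames, and user below are made up for illustration. `Host`, `HostName`, and `User` are standard ssh_config keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    alias: str     # short name you type, e.g. `ssh spark`
    hostname: str  # the node's name or address on the tailnet (hypothetical here)
    user: str      # login user on that node


def render_ssh_config(nodes: list[Node]) -> str:
    """Render one ssh_config stanza per node, separated by blank lines."""
    stanzas = [
        f"Host {n.alias}\n"
        f"    HostName {n.hostname}\n"
        f"    User {n.user}\n"
        for n in nodes
    ]
    return "\n".join(stanzas)


if __name__ == "__main__":
    nodes = [
        Node("spark", "spark-box", "me"),
        Node("rig", "rig-3090", "me"),
    ]
    # Review the output, then append it to ~/.ssh/config yourself.
    print(render_ssh_config(nodes))
```

printing instead of writing the file keeps the script safe to rerun. once the stanzas are in `~/.ssh/config`, `ssh spark` is the whole command.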
ok, be honest, no one is watching. of the five foundations in this post, how many do you actually have running right now? not "i've heard of them," running. zero is a real answer. i sat at zero for a year. the corner is small, let's find out how small.
someone asked me to elaborate on #1, tailscale. fair, it is the one that looks boring and is actually load bearing.
tailscale builds a private mesh network across every machine you own. laptop, desktop, a rented gpu node, your phone, all of them join one network, a tailnet. every device gets a stable private address that never changes, and any device reaches any other directly, encrypted, peer to peer.
why this is #1 and not #4: an agent that works across machines has to reach those machines. without a tailnet you are fighting public IPs, port forwarding, firewall rules, NAT, jump hosts, and every one of those breaks the second an IP changes or you switch networks. the agent does not orchestrate anything it cannot address.
with a tailnet that whole problem disappears. the agent on one box reaches every other box by a fixed name, from anywhere, and it keeps working. your stack stops being machines you can reach when the network cooperates and becomes one coherent system.
and the phone part is not a gimmick. tailscale on your phone puts you on the same tailnet, ssh into any node, check an agent, restart a run, from a coffee shop, from bed, from anywhere. you are never locked out of your own stack.
set this up first. everything else in that post assumes you already have it.
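"the agent does not orchestrate anything it cannot address" is easy to check in code. a minimal sketch, assuming your nodes answer SSH on port 22 and resolve by name on your network. the hostnames are placeholders, and this uses a plain TCP connect, not any tailscale API.

```python
import socket


def is_reachable(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and names that do not resolve.
        return False


def reachable_nodes(hosts: list[str], port: int = 22) -> dict[str, bool]:
    """Map each node name to whether an agent could open a connection to it."""
    return {host: is_reachable(host, port) for host in hosts}


if __name__ == "__main__":
    # Placeholder names; substitute the names your own machines use.
    for name, ok in reachable_nodes(["spark-box", "rig-3090"]).items():
        print(f"{name}: {'up' if ok else 'unreachable'}")
```

an agent can run a check like this before dispatching work, so a dead node shows up as a clear "unreachable" instead of a hung job.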
@sudoingX can you elaborate on #1, why is it so important and how does it benefit the agents?
and here is proof the five hold.
a devops engineer in the replies runs every one of them, then takes them up a tier, hermes agents in a GKE cluster with self hosted models behind them.
that is the part most people miss. the foundations do not change when you scale from one box to a cluster, they just take bigger forms. the principle is the same at one machine and at a hundred.
the agentic corner is small and it is real. this is what it looks like.
@sudoingX I utilize every one of these. I run my hermes agents in a GKE cluster with access to self hosted models as well!
the replies are doing exactly what i hoped, adding tools i didn't list.
this one is worth catching: reptyr. start a long process bare, outside tmux, then realize you need it inside a session, reptyr reparents the running process into tmux after the fact. the rescue tool for foundation 3.
the move is still start in tmux so you never need it. but the day you forget, this is the save.
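one way to make "start in tmux" the default is to never launch a long job bare. a small sketch that wraps any command in a detached tmux session. the session name and the job are hypothetical, and the helper names are mine. `new-session -d -s` and `attach-session -t` are standard tmux commands.

```python
import shlex
import subprocess


def tmux_start_command(session: str, command: list[str]) -> list[str]:
    """Build the argv that starts `command` in a detached tmux session."""
    # tmux runs a single command argument through the shell, so quote it safely.
    return ["tmux", "new-session", "-d", "-s", session, shlex.join(command)]


def tmux_attach_command(session: str) -> list[str]:
    """Build the argv that reattaches to a running session."""
    return ["tmux", "attach-session", "-t", session]


def start_in_tmux(session: str, command: list[str]) -> None:
    """Start the job; fails if tmux is missing or the session name is taken."""
    subprocess.run(tmux_start_command(session, command), check=True)


if __name__ == "__main__":
    # Hypothetical long-running job; print the command instead of running it.
    print(tmux_start_command("agent-run", ["python", "agent.py", "--steps", "500"]))
```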
@sudoingX reptyr seems like a useful tool too. You can move a running session between terminals.
a real one from today, while this post is fresh.
i nearly lost a 97MB agent session. it grew so large it will not reload, the whole working plan trapped in a conversation that outgrew itself. and a worktree i was using vanished off disk, the directory just gone.
here is the split that matters. the code was committed to a git repo, so it is still in there and recoverable, even though the working directory died. what i am scrambling to recover is the context i left in a chat window instead of the repo.
foundation 4, not as advice, as something that happened to me hours ago. the repo is the memory layer. the chat is not.
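a sketch of what "context lives in the repo" can look like in practice: the agent appends its working plan to a notes file inside the repo, so it gets committed along with the code. the file name `HANDOFF.md` and the entry format are assumptions for illustration.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_handoff_note(repo: Path, agent: str, note: str,
                        filename: str = "HANDOFF.md") -> Path:
    """Append a timestamped note to a file in the repo and return its path.

    Appending (never overwriting) keeps every earlier agent's context intact.
    """
    path = repo / filename
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    with path.open("a", encoding="utf-8") as f:
        f.write(f"## {stamp} {agent}\n\n{note.strip()}\n\n")
    return path
```

the file only protects you once it is committed and pushed, so a call like this belongs right before the commit step, not at the end of a long session.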
be honest, you bookmarked this and still have zero of the five running. no judgment, i did the same for a year. today is a good day to fix that, anon.
first fresh AMD number in and it's a good one.
an RX 7900 XTX, 24GB, running Qwen 3.6 35B-A3B at iQ4, 68.79 tok/s generation on vulkan. if you've been wondering whether AMD pulls its weight for local models, that is a real, plannable number.
and anon if you run AMD, keep them coming. model, quant, card, tok/s, one line. every entry sharpens the picture.
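if you are collecting these, a fixed shape keeps entries comparable. a tiny sketch of the "model, quant, card, tok/s, one line" format. the class and field names are my own, and the sample values are the ones from the entry above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchEntry:
    model: str
    quant: str
    card: str
    tok_per_s: float  # generation speed, tokens per second

    def one_line(self) -> str:
        """Format the entry as a single comma-separated line."""
        return f"{self.model}, {self.quant}, {self.card}, {self.tok_per_s:.2f} tok/s"


if __name__ == "__main__":
    entry = BenchEntry("Qwen 3.6 35B-A3B", "iQ4", "RX 7900 XTX 24GB", 68.79)
    print(entry.one_line())
```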
@sudoingX AMD Radeon RX 7900 XTX 24GB, Qwen 3.6 35B iQ4 NL, Vulkan, LM Studio. 1025 tok/s read and 68.79 tok/s write.
if cmake scares you, ollama is right there. no judgment. okay maybe a little.
people keep asking what engine i use. no lm studio. no ollama.
i compile llama.cpp from source every time for personal inference. no abstraction layers.
if you're serious about local inference, start at source level. it's a no brainer. here's why.
when you compile from source you control everything. which cuda arch to target. which quantization kernels to enable. flash attention flags. context size limits. you're not waiting for some gui app to update when gguf format changes or a new quant drops. you pull, you build, you run. minutes not days.
lm studio and ollama are fine for trying things. but the moment you need custom context lengths, specific kv cache configs, or hardware specific optimizations like the GB10 tensor cores on my spark, those abstractions become walls.
compiling from source means when something breaks you know exactly where. when something is slow you know exactly why. there's no black box between you and the metal.
that's the difference between using local ai and understanding local ai.
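"you pull, you build, you run" as a script, sketched under assumptions: a CUDA build inside an existing llama.cpp checkout, using the `GGML_CUDA` option from recent versions of the project and the standard `CMAKE_CUDA_ARCHITECTURES` variable. option names have changed across llama.cpp versions, so check the build docs for your checkout. the function names and the job count are illustrative.

```python
import subprocess
from pathlib import Path


def build_commands(cuda_arch: str, jobs: int = 8) -> list[list[str]]:
    """Commands to update and rebuild llama.cpp for one CUDA architecture.

    `cuda_arch` is the compute capability without the dot,
    e.g. "86" for an RTX 3090 (compute capability 8.6).
    """
    return [
        ["git", "pull"],                                    # pick up new quants and format changes
        ["cmake", "-B", "build", "-DGGML_CUDA=ON",
         f"-DCMAKE_CUDA_ARCHITECTURES={cuda_arch}"],        # configure for this GPU only
        ["cmake", "--build", "build", "--config", "Release",
         "-j", str(jobs)],                                  # parallel release build
    ]


def rebuild(checkout: Path, cuda_arch: str) -> None:
    """Run the build inside `checkout`, stopping at the first failing step."""
    for cmd in build_commands(cuda_arch):
        subprocess.run(cmd, cwd=checkout, check=True)
```

targeting a single architecture keeps compile times down, and it is the kind of choice a packaged app makes for you.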
i've run a stack of models across a single 3090, a 5090, and a 128GB DGX Spark. exactly three are worth building on. the honest list.
the three worth it:
> 1. StepFun Step-3.5 Flash, the REAP pruned 121B MoE (Q6, DGX Spark): a 121 billion parameter mixture of experts running on a single desktop box. the most worth-it model in everything i've tested.
> 2. Qwen 3.6 27B Dense, Q4 (single RTX 3090): the undisputed king of the 24GB tier. one shot a playable game, around 41 tok/s, fits with context headroom to spare. one 24GB card, this is your answer.
> 3. NVIDIA Nemotron 3 Nano Omni, 30B-A3B (DGX Spark): the best multimodal i've tested for video classification work. vision in, runs clean on the Spark.
the rest, ran them, they hold up fine:
on the Spark: DeepSeek V4 Flash 158B, GLM 4.7 Flash, GLM 4.5 Air REAP 82B-A12B, Gemma 4 26B-A4B, Qwen3-VL 235B-A22B, Qwen3 Coder 30B-A3B, Qwen3 30B-A3B, Carnice 35B-A3B.
on consumer GPUs: Kimi K2.5 1T, Qwen3-Coder-Next 80B, Hermes 4.3 36B, Qwen 3.5 27B Dense.
single 3090 to a 128GB Spark, that's the range. the three up top are the ones worth your hardware today.
1am now. agents running. laptop beside the bed. going to rest now while they cook. something will be ready when i wake up. i keep it beside me because when i get up for water at 3am i always end up prompting. goodnight anon.
if you run a single 24gb gpu, a 3090, a 4090, a 7900 xtx, whatever gets you the 24 gigs, the no brainer pick is qwen 3.6 27b dense at q4. not close.
i have run the tier. it fits in 24gb with real context room to spare, it keeps the reasoning smaller models lose, it pushes around 41 tok/s on a single 3090, and i watched it one shot a playable game start to finish, zero iterations.
nothing else in that vram class does what this model does. undisputed king of the 24gb tier, and there is nothing you can say to change my mind.
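rough arithmetic behind "fits in 24gb with room to spare", as a sketch. it assumes about 4.5 bits per weight, which is the Q4_0 block layout (32 four-bit weights plus one 16-bit scale). the post only says q4, other q4 variants land a little higher, and KV cache and runtime overhead are not counted here.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def headroom_gb(vram_gb: float, params_billion: float, bits_per_weight: float) -> float:
    """VRAM left for KV cache and runtime overhead after loading the weights."""
    return vram_gb - weight_size_gb(params_billion, bits_per_weight)


if __name__ == "__main__":
    # 27B dense at an assumed 4.5 bits per weight on a nominal 24 GB card.
    print(f"weights:  {weight_size_gb(27, 4.5):.2f} GB")
    print(f"headroom: {headroom_gb(24, 27, 4.5):.2f} GB")
```

under those assumptions the weights come to about 15.2 GB, leaving roughly 8.8 GB for context, which is consistent with the headroom described above.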