This threat model is concerned with running arbitrary code generated by or fetched by an AI agent on host machines which contain secrets, sensitive files, and/or data, apps, and systems which should not be exfiltrated or lost.
What about the threat model where an agent deletes your entire inbox? Or sends your calendar events to a server after prompt injection? Or makes a bank transfer of the wrong amount to the wrong address? All of these are allowed under the sandboxing model.
We need fine-grained permissions per task or per tool in addition to sandboxing. For example: "this request should only ever read my gmail and never write, delete, or move emails".
Sandboxes do not solve permission escalation or exfiltration threats.
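To make the per-task permission idea concrete, here is a minimal sketch of a deny-by-default allowlist wrapped around tool calls. Everything in it (`TaskPolicy`, `guarded_call`, the tool and action names) is hypothetical and not taken from any real agent framework. It only shows the shape of the check.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a task attempts an action its policy does not list."""


@dataclass
class TaskPolicy:
    # Hypothetical mapping: tool name -> set of actions this task may perform.
    allowed: dict = field(default_factory=dict)

    def permits(self, tool: str, action: str) -> bool:
        # Deny by default: unknown tools and unlisted actions are refused.
        return action in self.allowed.get(tool, frozenset())


def guarded_call(policy, tool, action, fn, *args, **kwargs):
    """Run fn only if the policy explicitly allows (tool, action)."""
    if not policy.permits(tool, action):
        raise PermissionDenied(f"{tool}.{action} is not allowed for this task")
    return fn(*args, **kwargs)


# The example from above: read gmail, never write, delete, or move.
READ_ONLY_GMAIL = TaskPolicy({"gmail": frozenset({"read"})})
```

The check has to live outside the agent's reach (in the tool gateway, not in the prompt), otherwise a prompt injection can simply talk its way around it.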
The biggest one (as Karpathy notes) is having skills for how to write a (slack, discord, etc) integration, instead of shipping an implementation for each.
Call it “Claude native development” if you will, but “fork and customize” instead of batteries-included platforms/frameworks is going to be a big shift when it percolates through the ecosystem.
A bunch of things you need to figure out, eg how do you ship a spec for how to test and validate the thing, make it secure, etc.
How long before OSs start evolving in this way? You can imagine Auto research-like sharing and promotion upstream of good fixes/approaches, but a more heterogeneous ecosystem could be more resistant to attacks if each instance had a strong immune system.
I'm not sure what the advantage is. Each user will have to waste time and tokens on the same task, instead of doing it once and shipping it to everyone.
OCI is a good choice of reuse, they aren't having the agent reimplement that. When there is an existing SDK, no sense in rebuilding that either. Code you don't use should be compiled away anyhow.
In order for it to be 'once': all hardware must have been, currently be, and always will be: interchangeable. As well as all OS's. That's simply not feasible.
The strength of open source software is collaboration. That many people have tried it, read it, submitted fixes and had those fixes reviewed and accepted.
We've all seen LLMs spit out garbage bugs on the first few tries. I've written garbage bugs on my first try too. We all benefit from the review process.
I would rather have a battle tested base to start customizing from than having to stumble through the pitfalls of a buggy or insecure AI implementation.
Also seems like this will further entrench the top 2 or 3 models. Use something else and your software stack looks different.
I’m assuming here an extrapolation of capabilities where Claude is competitive with the median OSS contributor for the off-the-shelf libraries you’d be comparing with.
As with most of the Clawd ecosystem, for now it probably is best considered an art project / prototype (or a security dumpster fire for the non-technical users adopting it).
> The strength of open source software is collaboration. That many people have tried it, read it, submitted fixes and had those fixes reviewed and accepted
I do think that there is room for much more granular micro-libraries that can be composed, rather than having to pull in a monolithic dependency for your need. Agents can probably vet a 1k microlibrary BoM in a way a human could never have the patience to.
(This is more the NPM way, leftpad etc, which is again a security issue in the current paradigm, but potentially very different ROI in the agent ecosystem.)
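One mechanical piece of vetting a large BoM is checking that each micro-library still matches what was last reviewed, so the agent only has to re-read what changed. A rough sketch, assuming a hypothetical `BomEntry` record that carries the fetched source and a SHA-256 digest pinned at review time (none of this is an existing tool's format):

```python
import hashlib
from dataclasses import dataclass


@dataclass
class BomEntry:
    name: str            # library name (hypothetical field)
    source: bytes        # library source as fetched
    pinned_sha256: str   # hex digest recorded when the source was last reviewed


def needs_review(bom):
    """Return names of entries whose source no longer matches the pinned digest."""
    return [
        entry.name
        for entry in bom
        if hashlib.sha256(entry.source).hexdigest() != entry.pinned_sha256
    ]
```

This only detects drift from a reviewed state. The actual review of each changed library is the part the agent would have to do.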