It absolutely is, and in some ways we've only just started! Although we definitely shouldn't move fast and break things with living animals and our food supply ;)
That would be glorious! If ChatGPT doesn't get the permissions right on the first try, I know I'm going to have to spend the next few hours reading the documentation or trying random combinations to get a token that works.
Why is "on the first try" so important? What's wrong with telling it your end goal and letting it figure out the exact right combo while you go off and work on something else?
I love the concept but I've never hosted such a terrible piece of software. Every update breaks something new or introduces another "anti-feature" that's enabled by default.
The documentation often lags behind, and the changelog has such a low signal-to-noise ratio that you need an LLM to figure out what upgrading will break this time. For now I've just given up on updates and have been patching bugs directly in the JS when they bother me enough.
If OpenClaw is the future of software I'm honestly a bit scared for the industry.
I'm open to suggestions. I tried Zeroclaw and Nullclaw but they're bad in their own ways. I'd like something that's easy to run on Kubernetes, has WhatsApp integration and, most importantly, stable releases.
> If OpenClaw is the future of software I'm honestly a bit scared for the industry.
I think it's mainly the industry wannabes gathering around a "sexy" brand name again, when they're really more interested in "AI as personal assistants".
OpenClaw just has the most traction despite being a hot mess, because the people hyping it up don't know how bad the codebase is, or because they want to launch something first and switch over to a more credible alternative later.
Thanks for this suggestion, I installed it yesterday after seeing this comment and this surely is a breath of fresh air! It appears that everything is designed reasonably well from the ground up. It’s more limited, but what’s there works well.
I'm on their lite plan as well and I've been using it for my OpenClaw. It had some issues but it also one-shotted a very impressive dashboard for my Twitter bookmarks.
For the price this is a pretty damn impressive model.
20%? That's a bit insane. This does happen in Europe but is heavily looked down upon and usually quickly corrected.
On the other hand, I did get a chewing out from an older guy for having a conversation with friends on a train once, so some people take it perhaps a bit too seriously.
It’s very much a thing on US public transit, with the added negative bonus that no one ever confronts the person doing it, because chances are they’re either crazy, armed, or both.
I agree, but you have to understand that a lot of European (leaders) still have WW2 in the back of their heads.
For them there are far worse things than giving up some freedoms.
One can agree or disagree with this, but Europe's actions are far more understandable if you see where they're coming from.
For what it's worth, the younger generation doesn't seem to see it the same way, so whatever censorship Europe introduces today will most likely be temporary.
> I agree but you have to understand that a lot of European (leaders) still have WW2 in the back of their heads.
Then they do not understand how or why WWII started. Few people are really interested in or care about this; it's treated more as a kind of Aesop's fable than a historical event.
I am more cynical than you, however: I suspect the Eurocrats who use WWII as a censorship justification know full well it has nothing to do with WWII.
> One can agree or disagree with this but Europe's actions are far more understandable if you see where they're coming from.
I think you're falsely attributing this to WW2. Free speech is simply not part of European culture in the way that it is part of American culture. The ideal of "free speech", regardless of how well that ideal is implemented in practice, is much more instilled in US culture than in European culture.
They simply do not give a shit the way the US claims it gives a shit about free speech. To them it's an afterthought. Nothing to do with WW2 and the trauma of it.
Worth pointing out that the modern American conception of freedom of speech is quite recent. It only really became a thing in the 1970s. Before then, porn, film, and even written materials on controversial subjects like abortion could be and were regulated.
The 1st Amendment is old, but the way it's applied today is quite radical compared to how it was applied for most of American history. The US being this committed to free speech isn't much older than the median American.
It's very weird. All these online laws and regulations seem like an attempt to reduce the cost of policing by turning the platforms into a police force, and I don't like that. If nazis gather on a platform, go get them or keep an eye on them. That's even better than pretending there are no nazis because you were able to silence them. Known cunts are much easier to deal with than cunts undercover, so seriously, why push people undercover? Let them speak; if that speech increases their numbers, then you must work on your own speech.
Some update broke the OpenRouter integration and I haven't been able to fix the issue. I took a quick look at the code, hoping to narrow it down, and it's pretty much exactly what you would expect: there are hidden configuration files everywhere and in general it's just a lot of code for what's effectively a for loop with WhatsApp integration (in my case :)).
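To make the quip concrete, the "for loop with WhatsApp integration" can be sketched in a few lines. This is a hypothetical illustration, not OpenClaw's actual code: `fetch_messages`, `call_llm`, and `send_reply` are made-up stand-ins for the real WhatsApp and model integrations.

```python
import time

def fetch_messages():
    # Placeholder: a real deployment would poll the WhatsApp Business API
    # or a bridge. Here we just return one canned message.
    yield {"chat_id": "demo", "text": "ping"}

def call_llm(prompt):
    # Placeholder for an OpenRouter/OpenAI-style chat-completion call.
    return f"echo: {prompt}"

def send_reply(chat_id, text):
    # Placeholder for the outbound WhatsApp send.
    print(f"[{chat_id}] {text}")

def agent_loop(poll_interval=0.0, max_iterations=1):
    """The whole 'product', reduced to its essence: a polling loop."""
    replies = []
    for _ in range(max_iterations):
        for msg in fetch_messages():
            reply = call_llm(msg["text"])
            send_reply(msg["chat_id"], reply)
            replies.append(reply)
        time.sleep(poll_interval)
    return replies
```

Everything else in a real deployment (auth, config, retries, sandboxing) is wrapping around this core, which is the commenter's point about the code-to-function ratio.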
Not to mention that their security model doesn't match my deployment (a rootless, locked-down Kubernetes container), so every OpenClaw update seemed to introduce some "fix" for a security issue that broke something else, solving a problem I don't have in the first place :)
I've switched to https://github.com/nullclaw/nullclaw instead. Mostly because Zig seems very interesting so if I have to debug any issues with Nullclaw at least I'll be learning something new :)
I really enjoyed using Claude, but the ever-changing limits and weird policies (limited to Claude Code, you can't run OpenClaw, etc.) made switching a very easy choice.
OpenAI simply provides more value for the money at the moment.
You're totally allowed to use Claude for OpenClaw, and you're totally able to use Claude Code with non-Anthropic models. You must be referring to the fact that you have to use an API key and cannot use the auth intended for Claude-only products, which AFAIK is the same at every AI company (with Google most recently destroying whole Google accounts for offenders).
Used codex cli (5.4) for the first time (had never used codex or GPT for coding before; I was using Opus 4.5 for everything), and it seems quite good. One thing I like is that it's very focused on tests: it will just start setting up unit tests for specs without you asking, whereas Opus would never do that unless asked. I like that and think it's generally good. One thing I don't like about GPT is that it pauses too much between tasks. Even when the immediate plan and the broader plan are all extremely well defined in agents.md, it still stops to say "the next logical task is X", and I say "yeah, go ahead", instead of it just proceeding to the next task, which I'd rather it do. I suppose that's a preference that should be put in some document? (agents.md?)
Well, I have a running model (ha!) in my head about the frontier providers that's roughly like this:
- ChatGPT is kinda autistic: it must follow procedures no matter what and writes in a bland, soulless, but kinda correct style. Great at research, horrible at creativity, slow at getting things done but at least getting there. Good architect, mid builder, horrible designer/writer.
- Claude is the sensitive diva: able to produce really elegant code but has to be reminded of correctness checks and quality gates repeatedly, so it arrives at something good very fast (sometimes one-shot) but then loses time in correction loops and "those details". Great overall balance, but permanent helicoptering is needed or it derails into weird loops.
- Grok is the maker: super fast and on target, but doesn't think as deeply as the others; it's entirely goal/achievement focused and does just enough to get there. Uniquely, it doesn't argue or self-monologue constantly about doubts or safety or ethics, but drives forward where the others struggle, and faster than them. It can't concentrate for too long, but it delivers fast. Tons of quick edits? Grok it is. "Experimental" stuff that is not safe to talk about... definitely Grok.
- Gemini is whatever you quickly need in your GSuite, plus a second pair of eyes on what the others are doing, sometimes with a different perspective; beyond that, worse than all the others.
- Kimi: currently using it on the side. Not bad at all so far, but nothing distinct has crystallized in my head yet.
Tried using 5.4 xhigh/codex yesterday with very narrow direction to write Bazel rules for something. This is a pretty boilerplate-y task with specific requirements: all it had to do was produce a normal rule set such that one could write declarative statements to use them just like any other language integration. It gave back a dumpster fire, just shoehorning specific imperative build scripts into Starlark. Asked Opus 4.6 and got a normal, sane ruleset.
5.4 seems terrible at anything that's even somewhat out-of-distribution.
I got it to build a stereoscopic Metal raytracing renderer of a tesseract for the Vision Pro in less than half a day.
It surprisingly went at it progressively, starting with a basic CPU renderer and working all the way up to a basic special-purpose Metal shader. Now it's cutting its teeth on adding passthrough support. YMMV.
The limits are what did it for me. They kept boasting about Opus performance and improvements, practically begging me to try it out, and when I did, it totally obliterated my usage. I'm sure it's good, but I stick to Sonnet because I've been burned badly. Never had that problem with ChatGPT, but it turns out they're just unprincipled and evil, which is a shame.
Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
Because Gemini, despite what the stats say, still produces garbage once the problem gets harder. It nails it under lab conditions, but on messy reality, creativity, or even code quality it's a far cry from Opus or the latest GPT-5.4, by a long shot. And it always has been. It's pretty good inside GSuite because of the integrations, but standalone it's near worthless compared to even grok-code-fast, which doesn't think much at all (but damn, it is fast). At this point Google keeps throwing pots of noodles at every wall in reach to see what sticks, which is more a kind of desperation that still works to boost Wall Street high scores than a streak or a breakthrough; just rapid-fire shotgun launches. No one serious talks about Gemini because it still isn't worth considering for real work outside shiny presentations and artificial benchmarks.
Gemini schools the other two when doing code reviews.
I used to think tokens were a commodity, but it's becoming clear that the jagged frontier is different enough, even for the easiest use case of SWE, that there's room for two if not three providers of different foundation models. It isn't winner-takes-all; they're all winning together. Cursor isn't properly taking advantage of the situation yet.
My experience exactly. The more "real" the problems become, the more the other models become unsuitable compared to Claude, the sole exceptions being DeepSeek/Kimi: speaking strictly w.r.t. metrics and basic tasks they are not better, but they are more interesting and handle odd, totally out-of-domain stuff better than the US models. An example: code I wrote for a hypercomplex, sedenion-based artificial neural network broke Claude so badly it started saying it is ChatGPT and can't evaluate/run code. Similar experience with all the US models, which are characterized by being extremely brittle at the fringes, though Claude least among them. Meanwhile the Chinese models are less capable at cookie-cutter stuff but keep swinging when things get really weird and unusual. It's like the US models optimize for the lowest minima achievable, and god help you if the distribution changes, while the Chinese models seem to optimize for the flattest minima: poorer quality across the board but far more robust behaviour.
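For readers unfamiliar with sedenions: they are the 16-dimensional hypercomplex numbers produced by the Cayley-Dickson construction (reals → complex → quaternions → octonions → sedenions), and this sort of code is a plausible out-of-distribution stress test. A minimal sketch of the construction (an illustration, not the poster's actual code):

```python
# Numbers are tuples of 2^n real coefficients; each doubling step treats a
# number as a pair (a, b) of half-length numbers.
def add(x, y): return tuple(p + q for p, q in zip(x, y))
def sub(x, y): return tuple(p - q for p, q in zip(x, y))
def neg(x):    return tuple(-p for p in x)

def conj(x):
    # Conjugation negates the "imaginary" second half, recursively.
    if len(x) == 1:
        return x
    h = len(x) // 2
    return conj(x[:h]) + neg(x[h:])

def cd_mul(x, y):
    # Cayley-Dickson doubling: (a, b)(c, d) = (ac - conj(d)b, da + b conj(c))
    if len(x) == 1:
        return (x[0] * y[0],)
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return (sub(cd_mul(a, c), cd_mul(conj(d), b))
            + add(cd_mul(d, a), cd_mul(b, conj(c))))

def e(i, dim=16):
    # i-th basis sedenion (dim=16); e(0) is the real unit.
    v = [0] * dim
    v[i] = 1
    return tuple(v)
```

Each imaginary unit squares to -1, multiplication is neither commutative nor associative, and at 16 dimensions zero divisors appear, which is exactly the kind of algebra that falls outside the usual training distribution.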
I tend to use LLMs more for research than actual coding, so I ended up going with GPT over Claude because its chat interface just seems to work better for me. That balances out Claude being slightly better at software tasks.
What a baffling comment. Aren’t you aware of why this exodus is happening? (It’s not related to “value for the money”!) What are your feelings on that part?
Whatever Anthropic might or might not do with the Department of War interests me in proportion to how much I can influence it. Rounded, speaking as a European citizen, that appears to be exactly 0.
If anyone thinks Anthropic or OpenAI are the "good guys," they've already lost the plot. If you look at additional reporting on the topic, not just the Anthropic PR spin, the disagreements were much more nuanced than Anthropic portrayed them, and Anthropic isn't exactly a reliable narrator on the topic either. In fact it seems like Amodei fumbled the deal and crashed out a bit: he's already walked back his internal memo and is reportedly still seeking a deal with the Pentagon. I don't trust either CEO. I use their products, but if you're even leaning 51-49 on who is "less evil," I think you're giving too much slack.
Ever tried living while only patronizing groups that strictly align, morally and ethically, with your own personal beliefs?
I would love to, but in practice that seems impossible.
My $0.02: Claude was already involved in underhanded shit I don't want a part of[0], and that generated little ethical response from Anthropic. I've had better luck as a 200/mo-tier customer with ChatGPT, and I don't really think Dario claiming that their newest LLM is conscious[1] on a market schedule is all that ethical, either.
Why paint the choice as black and white? Most people are doing the best they can morally, even if they don't get it 100% right. Living 60% in accordance with your values is better than 50%. Likewise, bucketing organizations as good or bad misses the same nuance. Choosing something that is slightly better has positive consequences despite it not being 100% good.
Not the poster, but I guess that's kinda American thinking, actually believing that voting with your wallet will make any difference in this late-stage crony capitalism in a post-facts world.
Realistically: AI WILL get used in the military and for killing autonomously, like it or not, believe it or not. I am also against that in principle, but I accept the fact that my opinion just doesn't matter, and I practice radical acceptance of reality as-is. Twitter/X is also alive and kicking, despite Musk and the anti-Musk hate. xAI/Grok is genuinely really good too compared to OAI/Claude; a bit different, but very good. At this point all the "outcries" feel like noise I just skip on principle. But it could turn up the fire under the OAI team to go aggressive feature- and pricing-wise in order to retain/increase their userbase again, which is... good, after all.
> Everyone involved would be better off with a lower (or negative) income tax instead of subsidies.
That's quite wrong. Low-income earners effectively pay no income tax (after deductions and so on), so lowering the income tax further would do absolutely nothing for them.
It'd be economic and political suicide to lower taxes during high deficits while government money is literally being blown into fine dust in various wars around the world.
Refundable tax credits are a thing the government already knows how to do. If a negative income tax law were written to allow refunds to people who owe net negative tax, the IRS could administer it.
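The mechanism being debated is easy to state precisely. A Friedman-style negative income tax pays out a fraction of the shortfall below a threshold instead of (or alongside) taxing income above it. A toy sketch with made-up numbers (the threshold and rates here are purely illustrative, not a proposal):

```python
def negative_income_tax(income, threshold=20_000.0,
                        tax_rate=0.25, nit_rate=0.50):
    """Tax owed (positive) or refund due (negative) under a toy
    Friedman-style negative income tax.

    Above `threshold`, pay `tax_rate` on the excess; below it,
    *receive* `nit_rate` of the shortfall as a refundable credit.
    """
    if income >= threshold:
        return tax_rate * (income - threshold)   # positive: you pay
    return -nit_rate * (threshold - income)      # negative: you're paid
```

So someone with zero income receives a payment, someone at the threshold owes nothing, and payments phase out smoothly as earnings rise, which is the point of contrast with both flat subsidies and a plain rate cut (which, as the parent notes, gives nothing to people already paying no income tax).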