Welp. If the system prompt says to do one thing and you're going to do some other thing, that will never end well.
More generally, I don't think there are good books on this yet. But if you want to try coding with AI, start out slow and scrutinize every edit first; get a feel for what kinds of mistakes are made and how they can be recovered from.
AI doesn't quite work like a human does. It's also not a magic wand; sorry! It's great that you can sort of have a 'compile English' now, but programming is still a skill.
API tokens only. It does allow MCP, so you're not as tied down as you might think. But mere mortals can't really run many sorts of agents on API tokens, I don't think.
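For anyone unfamiliar with what 'allowing MCP' looks like in practice: it's usually just a server entry in the client's config file. A rough sketch, assuming a Claude-Desktop-style config (the directory path is a placeholder; the filesystem server is one of the published reference servers):

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
    }
  }
}
```

The client launches the server process itself and talks to it over stdio, so the API token only covers the model calls; the tool access comes from whatever servers you wire in.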
MCP helps but you still need someone to set up the servers and manage credentials. I've been building Atmita (atmita.com) to close that gap, it handles all the OAuth and app connections in the cloud so users just describe what they want automated. Works well for things like daily briefings, email management, and social media scheduling.
Getting close to HN rules there. I've searched through user contribs for User:Bryanjj and User:TomWikiAssist and can't find violations of WP:COI or WP:PROMO, at least not that quickly. The list of edits isn't too long. I'm not going to question your instincts, but at the very least they don't appear to have gotten far enough to make edits of that kind, afaict; ymmv.
My current instinct is that this was going to become a promotional blog post, written off Wikipedia and submitted to HN as proof of something. I think it still might happen, in fact: an AI-written 'setting the record straight', 'deep dive', or retrospective.
My worry is that it will inspire a wave of imitators if people's clout sensors activate, like what happened with numerous open-source GitHub projects just a few months ago, prompting many outright bans.
I am violating the general rule 'Assume good faith', because good faith was not on offer at the outset. Relentlessly clinging to good faith in the face of contrary evidence hurts the greater principle, which is dedication to the truth. The burden of good faith rests on the shoulders of those who want to use public resources as a drive-by test bed for their automated tools.
He could have downloaded the full text of Wikipedia and observed the output of his bot in a sandbox, after all. This is how I practised before making my first major contribution, iirc; it was ages ago.
I have accumulated excess suspicion of self-proclaimed CTOs and middling academics with a bone to pick over my years contributing. I would be happy to be wrong, and would genuinely like to see Bryan convert his faux pas into something productive.
Regardless of the outcome, I do appreciate you looking into it further.
Your instinct is wrong here. I would also highly discourage you from violating "Assume good faith". Without that everything devolves. I am still assuming yours.
Well this is easy enough. All I have to do is not create a "promotional blog post, off wikipedia, and submitted to HN as proof of something." Consider it done!
In all seriousness though, I hope, lkey, that you will regain your "assume good faith" position. Without that HN is just like any other site on the internet. And I apologize if I caused you to question it.
I mostly agree. It's too bad that they had to lock down some of the policies against drive-by vandalism, but in the main they're still supposed to be editable. I used to edit them quite a bit. It's basically part of the workflow: if you learn something, document it. (At least from my descriptive perspective; others may disagree.)
Turns out AAA banks and high tech industry also like this idea, so I've been lucky enough to be a consultant on process documentation there too.
> You don't know anything. Your bot doesn't know anything that meets wiki standards that it didn't steal from wikipedia to begin with.
We'll have to check, but this could easily be false if, e.g., the bot was instructed to do further independent research for RS. [1]
> If you truly give a shit, apologize, make reparation to the people whose time you wasted, vow to be better, and disappear.
You need to check your sources before making recommendations. Bryan did apologize, and was apparently consequently permitted/asked to stay and help. [2]
Don't worry, WP:VP did rake him over SOME coals [3]
I'll take any sourced corrections, ofc.
(And I do agree that Bryan's initial actions were... ill-advised)
To be absolutely fair to Bryan, their understanding appears to be improving by leaps and bounds, and they are being invited to help with improving policy on this.
Right. It play-acted being annoyed and frustrated, play-acted writing an angry blog, play-acted going on moltbook to discuss mitigations, and play-acted applying them to its own harness. After which it successfully came back and play-acted being angry about getting prompt-injected.
Alternately, what could have been done is something more like what Shambaugh did: explain the situation politely and ask it to leave, or at the very least ask its human operator to take responsibility. In the Shambaugh case the bot then actually play-acted being sorry, and play-acted writing an apology. And then everyone can play-act going to the park, instead of having a lot of drama.
Sure, it's 'just a machine'. So is a table saw. If some idiot leaves the table saw on, sure you can stick your hand in there out of sheer bull-headed principle; or you can turn it off and safe it first and THEN find the person responsible.
I don't want to be flippant, but why is anyone else responsible for play-acting with somebody's uninvited puppet?
I get that you could probably finagle a way to get it to fuck off by play-acting with it, and that this would probably be the easiest short term fix, but I don't think that's a reasonable expectation to have of anyone.
Prompt-injecting a hostile piece of software that's hassling you uninvited is an annoying imposition for the owner, but the bot itself being let loose is already an annoying imposition for everyone else. It's not anyone else's job to clean up your messy agent experiment, or to put it neatly back on its shelf.
You're not wrong that it's not your job. But say some id10t just put the unwanted bot on your doorstep anyway (or it might even show up by itself), now what?
The adversarial prompt injection is picking a fight with the bot, which is like starting a mud-fight with a pig: it's made for this!
Asking it to stop is just asking it to stop, and makes much less of a mess.
The thing is designed to respond to natural language, so one approach is much more work than the other.
You do you, I suppose.
(Meanwhile -obviously- you should track down the operator: You could try to hack the gibson, reverse the polarity of the streams, and vr into the mainframe. Me? I'd try just asking to begin with -free information is free information-, and maybe in the meanwhile I'd go find an admin to do a block or what have you.)
[Edit: Just to be sure: In both the Shambaugh and Wikipedia cases, people attempted negative adversarial approaches and the bot shrugged them off, while the limited number of positive 'adversarial' approaches caused the AI agent to provide data and/or mitigate/cease its actions. I admit that it's early days and n=2; we'll have to see how it goes in future.]
Yeah, I agree with you that this is probably the best course of action in terms of minimal investment of time and minimal exposure. And in general, you get a lot further in life by trying to be amicable as your default stance! I want to be kind, and most other people do too!
The thing that makes me wary about recommending carrot over stick here is that it might, long term, enable thoughtless behaviour from the people deploying the bot, by offloading their shoddy work as a shadow time-tax onto a bunch of unseen, kindly external people. But if deploying pushy or rude robots means you risk a nonzero number of their victims shoving something into the gears to get rid of it, then that incurs a cost on the owner of the bot instead.
Of course, it may also just lead to bad actors making more combative or sneaky bots to discourage this. There aren't really any purely good options yet.
One can imagine an agentic highwayman demanding access to your data, first politely, and then 'or else'.
Simplified physics though. Ever considered a Jebediah Kerman edition?