scratchyone's comments | Hacker News

is GLM genuinely comparable to claude models? haven't had a chance to test it yet.

does google actually host anthropic models themselves?? surprised anthropic allows that, given how notoriously crazy they are about distillation or weight leaks or any hints of their models being used in the wrong way.

Yes, we host it ourselves, acting as the data processor, which can be important for enterprise customers.

From a developer-experience standpoint, hosting them ourselves allows us to take advantage of our unique infra and deliver the fastest time to first token of any provider.


maybe, but responding to GPU shortages with increased error rates is the concern imo. they could implement queuing or delayed responses instead. it's been long enough that they've had plenty of time to build something like this, at least in their web UI where they have full control. instead it still just errors with no further information.
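
a rough sketch of what queueing could look like server-side, in Python (every name and number here is hypothetical, just to illustrate waiting for capacity instead of failing fast):

    import asyncio

    MAX_CONCURRENT = 64   # hypothetical GPU capacity
    MAX_QUEUED = 256      # requests held in line before we push back

    class OverloadedError(Exception):
        def __init__(self, retry_after: int):
            self.retry_after = retry_after  # surface as a Retry-After header

    gpu_slots = asyncio.Semaphore(MAX_CONCURRENT)
    queued = 0

    async def run_model(prompt: str) -> str:
        # stand-in for the actual inference call
        await asyncio.sleep(1)
        return "response"

    async def handle_request(prompt: str) -> str:
        global queued
        if queued >= MAX_QUEUED:
            # only error once the queue itself is full, and say why
            raise OverloadedError(retry_after=30)
        queued += 1
        try:
            async with gpu_slots:  # wait for a free slot instead of erroring
                return await run_model(prompt)
        finally:
            queued -= 1

even the overflow case gives the client a retry-after hint instead of an opaque error.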

I've been experiencing a good amount of delays (it says it's taking extra time to really think, etc), and I'm using it during off-peak times.

i notice that as well. most of the time when i see those it has a retry counter also and i can see it trying and failing multiple requests haha. almost never succeeds in producing a response when i see those though, eventually just errors out completely.

Coding is a solved problem. Claude writes the code. I edit it. I code around it.

Engineer roles dead in 6 months.


> I edit it. I code around it.

You're never gonna guess what software engineers do.


Because of the context, I would think this is sarcasm, but I am not sure.

It is.

This is super nice, thank you for including an AI disclosure. I would probably normally avoid something like this bc I would be concerned about how much code oversight there is. Very nice to know that it's properly overseen by a human. Installed it and it's quite nice!!! great work :)

It would also be nice to be able to hide the checkbox it adds to the homepage. also, disabling "show focus box" doesn't actually seem to work?


> It would also be nice to be able to hide the checkbox it adds to the homepage. also, disabling "show focus box" doesn't actually seem to work?

This is fixed and being pushed now. Thanks!


is there anything interesting in the unstable branch??

also lmfao bikeshedding a custom language is EXACTLY what i would expect from the dev of that kinda nerd game. feels like a good sign tbh


I think it's called yak-shaving when you actually do it. Bikeshedding is discussing stuff that doesn't matter to avoid talking about the hard stuff.

Last I checked there's some rebalancing that looks pretty good, but the real advantage from the custom language is a huge performance boost.

tbh they really didn't, tinygrad's was clearly a joke response. they were not providing a real uptime target.


I mean more fundamentally, if they have access to even more advanced models than all of us and have this much downtime, does that imply that their models are possibly not so great at software dev?

But yes you're definitely right, it's perhaps more ironic than contradictory.


Agreed. Having some level of human input makes a submission at least meaningful. If the entire repo and all text is generated by an LLM, does it really matter if the human is the one posting the link? It's functionally indistinguishable from automated spam.


For what it's worth, there are modern LLM detectors with extremely low false-positive rates. The tech has advanced quite a bit since the ZeroGPT days. Personally I've gotten very good results from Pangram Labs. Still can't directly ban people though because false positives are always possible.


Are they great at detecting output from normal prompts that don't try to make the LLM speak non-LLM-ishly? If you make the LLM not use em dashes, "it's not; it's" phrases, and similar tells, and have it make a few mistakes here and there, would it still be detected? My point is that if people aren't trying to hide their LLM use, detection might work; otherwise it probably wouldn't. How would a detector work against output where the prompt tells the LLM to alter the way it writes? Or where the output is modified by another LLM specifically designed to mimic certain styles?

Like, why would my comment (or yours, or any other comment) pass or fail the LLM check if I/you/someone else used specific prompts or another LLM to edit the output? It seems like these tools would work on 99.9% of outputs, but those outputs likely weren't created in an adversarial way.


Is that false-positive rate from your own testing, or the author's claims? What is the source of ground truth?


CARROT has this and it’s amazing! You can “time travel” back as far as you want. Absurdly far, even. I can tell you that it was 20 degrees in my town on Jan 1st, 1940.

