Has anyone ever done a proper security audit of the VLC builds people actually download? I don't trust them, and the fact that the releases on GitHub don't include binaries makes me trust them even less. Nobody is compiling VLC from source, and they don't provide any sort of provenance from a GitHub Actions pipeline.
Look at the list of supported formats. It includes so many parsers, mostly written in C, which means there are probably a few dozen ways to exploit the player.
Thinking time is not the issue. The issue is that Claude does not actually complete tasks. I don't care if it takes longer to think; what I care about is getting partial implementations scattered throughout my codebase while Claude pretends it finished everything. You REALLY need to fix this. It's atrocious.
Do you guys realize that everyone is switching to Codex because Claude Code is practically unusable now, even on a Max subscription? You ask it to do tasks, and it does 1/10th of them. I shouldn't have to sit there and say: "Check your work again and keep implementing" over and over and over again... Such a garbage experience.
Does Anthropic actually care? Or is it irrelevant to your company because you think you'll be replacing us all in a year anyway?
Or, ask it to make a plan, and it makes a good plan! It explicitly notes how validation is to take place at each stage!
And then it does every stage without running any of the validation. It's your agent's plan; it should probably be generated in a way your own agent can actually follow.
Whenever Anthropic has an opportunity to do the right thing, they go the opposite way. For example, when their source leaked, instead of open-sourcing it (something people have been asking for for years, so they could contribute the fixes Anthropic doesn't care to make itself), they tightened the noose further.
If it isn't obvious by now, this problem is only going to get worse. The only reason subscriptions still exist is that they're waiting to pull off the biggest bait-and-switch in history. Don't get sunk into this ecosystem, or you're in for a world of pain down the road. As has always been the case, competition and open source are our only hope.
They're just removing it from public access and selling it to big money instead: think large advertising companies, government agencies, Coca-Cola, Hollywood, etc. The scary part is that now that it's been removed from public access, it's going to be harder to keep a pulse on what is real and what is fake. We can't trust any video, audio, or text content now.
If a model was trained on <|begin_text|> <|end_text|> delimiters and you change the tokens passed in to <|start_text|> <|end_text|>, it loses several 'IQ points', if it can even answer at all anymore.
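To make this concrete, here's a toy sketch (a made-up five-entry vocab, not any real tokenizer) of why the swapped delimiter hurts: the trained delimiter maps to a single special-token id, while the unseen one gets shredded into ordinary subword pieces the model never saw in that position.

```python
# Hypothetical toy vocab: one trained special token plus a few subword pieces.
VOCAB = {"<|begin_text|>": 0, "<|": 1, "start": 2, "_text": 3, "|>": 4, "hello": 5}

def tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenization over the toy vocab."""
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in VOCAB:
                ids.append(VOCAB[text[i:j]])
                i = j
                break
        else:
            i += 1  # no piece matches; skip the character
    return ids

print(tokenize("<|begin_text|>hello"))  # trained delimiter  -> [0, 5]
print(tokenize("<|start_text|>hello"))  # swapped delimiter  -> [1, 2, 3, 4, 5]
```

The second sequence is what the model actually receives, and it has never been trained to treat that fragment soup as a text boundary, hence the lost 'IQ points'.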
Synthetic data is fine. Synthetic data generated from a task description, covering very similar questions, is usually fine too. But once the shape of what you're training on gets too close to the actual holdout questions, you're getting an uplift that won't carry over to genuinely unseen tasks.
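One common heuristic for catching this is n-gram overlap between training examples and the holdout set; a minimal sketch (the 8-token threshold is an illustrative choice, not a standard):

```python
def ngrams(text: str, n: int = 8) -> set:
    """All word-level n-grams of a lowercased text."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(train_example: str, holdout: list[str], n: int = 8) -> bool:
    """Flag a training example that shares any n-gram with a holdout item."""
    train = ngrams(train_example, n)
    return any(train & ngrams(q, n) for q in holdout)

holdout = ["what is the capital of france and why did it become the capital"]
paraphrase = "name the capital city of france"                        # different shape
near_copy = "what is the capital of france and why did it become so"  # too close

print(is_contaminated(paraphrase, holdout))  # False
print(is_contaminated(near_copy, holdout))   # True
```

It's a blunt instrument (it misses paraphrases, which is exactly the "shape" problem the parent describes), but it at least catches near-verbatim leakage.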
Does it matter, though? If it accomplishes the task, it accomplishes the task. Everyone uses a harness anyway, so finding the best harness is what's relevant. Perhaps this also hints at something bigger: we're wasting our time focusing on the model when we could be focusing on the harness.
I feel like they should be legally required to provide scanning infrastructure for this sort of thing. The potential economic damage can be catastrophic. I don't think this is the end of the litellm story either, given that 47k+ people were infected.
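Even without registry-side scanning, one building block on the consumer side is refusing to run anything whose digest doesn't match a pinned value. A minimal sketch (the file name and digest in the usage comment are placeholders, not real litellm artifacts):

```python
import hashlib

def verify_artifact(path: str, expected_sha256: str) -> bool:
    """Stream a downloaded artifact and compare its SHA-256 to a pinned digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256

# Hypothetical usage: the expected digest should come from a lockfile or a
# signed release manifest, never from the same server as the artifact itself.
# verify_artifact("some-package-1.0.tar.gz", "ab12...")  # placeholders
```

pip's `--require-hashes` mode does essentially this for Python dependencies, which would have limited the blast radius of a poisoned release.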