Alleged AMD Strix Halo APU Appears in Benchmark

brucethemoose@lemmy.world · edit-2 1 day ago

Yeah, ultimately a lot of devs are trying to make “story generators” that rely on the user’s imagination to fill in the blanks, which is why RimWorld is so popular.

There’s a business/technical model where “local” LLMs would kinda work for this too, if you set it up like the Kobold Horde. So the dev hosts a few GPU instances for users whose PCs can’t handle the local LLM, but users with beefy PCs also generate responses for other users (optionally, with a low priority) in a self-hosted horde.

brucethemoose@lemmy.world · 1 day ago

The more you buy, the more you save!

brucethemoose@lemmy.world · edit-2 1 day ago

I just want Elon to release the weights for the new one. You know, like he said he would. It actually seems pretty good (whereas the old Grok was totally useless for the size).

But no… just keep harping on OpenAI not being open while you do the same thing.

brucethemoose@lemmy.world · 4 days ago

Looks shiny and “deep fried,” kind of like Midjourney?

One open secret in AI land is that everyone trains on everyone else’s output (like this, to adopt the popular style), even if it’s blatantly against the license. Who’s gonna prove it?

brucethemoose@lemmy.world · 7 days ago

They already couldn’t afford this situation, and look where they are.

What’s an improbable “acceptable risk” to them may not be good enough for NASA, especially if they don’t really understand what’s wrong.

brucethemoose@lemmy.world · edit-2 7 days ago

Same with large black holes, which are (going by their event horizon) less dense than water. Black holes get less dense as they get more massive.

If you drop one into an ocean, the ocean will immediately collapse into an even bigger black hole and mess up space around it.
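For the curious, the density claim can be checked with a quick back-of-the-envelope sketch. It assumes a non-rotating black hole and defines "density" as mass divided by the volume of an ordinary sphere with the Schwarzschild radius, which is a convention and not a statement about the interior. The function names are just illustrative.

```python
import math

G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8             # speed of light, m/s
SOLAR_MASS = 1.989e30   # kg
WATER_DENSITY = 1000.0  # kg/m^3

def schwarzschild_radius(mass_kg):
    """Event horizon radius of a non-rotating black hole: r_s = 2GM/c^2."""
    return 2 * G * mass_kg / C**2

def mean_density(mass_kg):
    """Mass divided by the volume of a sphere with radius r_s."""
    r = schwarzschild_radius(mass_kg)
    return mass_kg / ((4 / 3) * math.pi * r**3)

def mass_for_density(density):
    """Invert rho = 3 c^6 / (32 pi G^3 M^2) to get the mass M."""
    return math.sqrt(3 * C**6 / (32 * math.pi * G**3 * density))

if __name__ == "__main__":
    for solar_masses in (10, 1e6, 1e9):
        rho = mean_density(solar_masses * SOLAR_MASS)
        print(f"{solar_masses:>12g} solar masses: {rho:.3g} kg/m^3")
    crossover = mass_for_density(WATER_DENSITY) / SOLAR_MASS
    print(f"Water density at about {crossover:.3g} solar masses")
```

Since the radius grows linearly with mass, the volume grows with the cube of the mass, so this density falls off as 1/M². The crossover with water lands at roughly a hundred million solar masses.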

brucethemoose@lemmy.world · 7 days ago

It’s crazy that Twitter has such an outsized influence on the public, and I think it’s because news outlets amplify it so much.

It doesn’t have that many active users. And news rarely covers other platforms when something makes a lot of noise and reaches many eyeballs.

brucethemoose@lemmy.world · 8 days ago

This doesn’t touch the weights at all, it’s just a change to the sampler.

What lobotomizes their models is cost cutting and trying to make them “safe,” or at least that’s what I suspect.

brucethemoose@lemmy.world · 9 days ago

They can cycle through some biases (dozens?) and test them all. Detokenization is super cheap to run, it’s not AI or anything.

I’m trying to think of a good analogy for how this would work, and I kinda came up with one. This would be kinda like an image encoder that biases itself towards coding RGB values (0-255) as even numbers. Subtly, say 30% odd 70% even.

That’s totally imperceptible to humans. And even a “small” sample of the image would carry this bias if pasted into a larger image verbatim, since the sample size is so large (just as the sample size for a bunch of tokens in text is pretty big).

And I’m not saying it’s foolproof… but if that’s indeed what they’re doing, I think it’s a decent way to detect “lazy” OpenAI abusers who aren’t working so hard to scramble and defeat it.
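Here's a rough sketch of the even/odd pixel analogy above, just to show how a small crop still carries the bias. The encoder is entirely made up (random pixel values, a hypothetical `encode_pixel` that nudges the lowest bit), so treat it as an illustration of the statistics and not a real image codec.

```python
import random

def encode_pixel(rng, even_prob=0.7):
    """Pick a 0-255 value, then nudge it by 1 so it is even with even_prob."""
    value = rng.randrange(256)
    want_even = rng.random() < even_prob
    if (value % 2 == 0) != want_even:
        value ^= 1  # flip the lowest bit; stays in 0-255, changes value by 1
    return value

def encode_image(n_values, even_prob=0.7, seed=0):
    rng = random.Random(seed)
    return [encode_pixel(rng, even_prob) for _ in range(n_values)]

def even_fraction(values):
    """Share of channel values that are even; about 0.5 means no bias."""
    return sum(v % 2 == 0 for v in values) / len(values)

if __name__ == "__main__":
    image = encode_image(100_000)          # biased "image"
    crop = image[40_000:45_000]            # small sample pasted elsewhere
    plain = encode_image(5_000, even_prob=0.5, seed=1)
    print("crop of biased image:", even_fraction(crop))
    print("unbiased image:      ", even_fraction(plain))
```

A change of 1 out of 255 per channel is invisible, but a crop of a few thousand values sits far from the 50/50 split you'd expect by chance.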

brucethemoose@lemmy.world · edit-2 9 days ago

It’s not so trivial if OpenAI cycles the logit bias or makes it really convoluted.

And it’s not like certain “words” or language patterns are more probable with this method. It’s different from what any kind of human or word-based algorithm would detect, which is what I suspect most “anti AI detection” software does.

It’s doable… but seems inconvenient for a small business to keep up with. Maybe.

brucethemoose@lemmy.world · edit-2 9 days ago

You have full control of your logit outputs with local LLMs, so theoretically you could “unscramble” them. And any finetuning would just blow that bias away anyway.

OpenAI (IIRC) very notably stopped giving the logprobs of their models. They did this for many reasons, and most of them boil down to “profits” and “they are anticompetitive jerks,” but another reason is to enable watermark methods just like this.

Also, the thing about this is that basically no one uses self-hosted LLMs compared to OpenAI (or really any API) LLMs.

brucethemoose@lemmy.world · edit-2 9 days ago

This has been known in the ML space forever. LLMs don’t actually output words/tokens, but probabilities for a long list of tokens, and the sampler picks one (usually the most likely token). And if you arbitrarily weigh these probabilities (e.g. 50% of possible token outputs are more likely than the other 50%, as a random example), it creates a “signature” in any text that’s easy to measure. The sampler randomizes it a tiny bit, but that averages out in long texts.
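A toy sketch of that idea, under loud assumptions: the "model" here is just random logits, and `VOCAB_SIZE`, `BIAS`, `SECRET`, and the hash-based split of the vocabulary are all hypothetical. It is not OpenAI's method, only an illustration of how a logit bias leaves a measurable signature.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000    # hypothetical toy vocabulary size
BIAS = 2.0           # hypothetical logit boost for the favored half
SECRET = "demo-key"  # hypothetical key that decides which tokens are favored

def is_favored(token_id, secret=SECRET):
    """Deterministically assign roughly half the vocabulary to the favored set."""
    digest = hashlib.sha256(f"{secret}:{token_id}".encode()).digest()
    return digest[0] % 2 == 0

FAVORED = [is_favored(i) for i in range(VOCAB_SIZE)]

def sample_token(rng, watermark):
    # Stand-in for a real model: fresh random logits at every step.
    logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
    if watermark:
        logits = [l + BIAS if FAVORED[i] else l for i, l in enumerate(logits)]
    weights = [math.exp(l) for l in logits]  # unnormalized softmax
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]

def generate(n_tokens, watermark, seed=0):
    rng = random.Random(seed)
    return [sample_token(rng, watermark) for _ in range(n_tokens)]

def z_score(tokens):
    """Standard deviations by which the favored count exceeds chance."""
    n = len(tokens)
    p = sum(FAVORED) / VOCAB_SIZE
    hits = sum(FAVORED[t] for t in tokens)
    return (hits - n * p) / math.sqrt(n * p * (1 - p))

if __name__ == "__main__":
    print("watermarked z:", round(z_score(generate(200, True)), 1))
    print("plain z:      ", round(z_score(generate(200, False)), 1))
```

The detector only needs the key and the token IDs, not the model, which is why checking is cheap. Sampling noise shows up in the plain score but washes out as the text gets longer.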

It’s defeatable. I’m sure if you make enough OpenAI queries, you can find the bias. I think a paper already tackled this. But this likely will stop the lazy abusers, aka 99% of abusers, who should just use some other LLM if they really care.

Another open secret in LLM land is that OpenAI is actually falling behind open research efforts, hence it’s hilarious it took them this long to implement something so simple.

brucethemoose@lemmy.world · 12 days ago

It shouldn’t be finetuning, if anything it should be RAG with an embeddings model + regular inference.

This is kinda cool, but it still doesn’t seem to justify bogging down a machine with a huge LLM. And I am speaking as a massive local LLM enthusiast who uses them every day.
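To make the RAG suggestion concrete, here's a minimal sketch. The `embed` function is a bag-of-words stand-in for a real embeddings model, and `retrieve` and `build_prompt` are hypothetical helper names. The resulting prompt would be handed to whatever regular inference setup you already run.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=1):
    """Stuff the retrieved text into the prompt; the model weights stay untouched."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "GPUs accelerate matrix multiplication.",
        "Bread needs yeast to rise.",
    ]
    print(build_prompt("What do GPUs accelerate?", docs))
```

The point of the design is that new knowledge lives in the document store, so updating it means re-embedding text instead of retraining anything.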

brucethemoose@lemmy.world · edit-2 13 days ago

Mozilla management was paid millions to develop a new “vision” of a theoretical future with AI chatbots

Is this llamafile?

The thing about LLMs is that no one knows how to write the ultra low level optimizations/runtimes, so they port others (llamafile largely borrows from llama.cpp AFAIK, albeit with some major contributions from their own devs).

Performance is insanely critical because they’re so darn hard to run, and new models/features come out weekly which no sane dev can keep up with without massive critical mass (like HF Transformers, mainly, with llama.cpp barely keeping up with some major jank).

So… I’m not sure what Mozilla is thinking here. They don’t have many of those kinds of devs, they don’t have a GPU farm, they’re not even contributing to promising WebAssembly projects like mlc-llm. They’re just one of a bazillion companies that were ordered to get into AI with no real purpose or advantage. And while Gemma 2B may be the “first” model that’s kinda OK on average PCs, we’re still a long way away from easy mass local deployment.

Anyway, what I’m getting at is that I’m a local LLM tinkerer, and I’ve never touched or even looked at anything from Mozilla. The community would have if anything of theirs was super useful.

brucethemoose@lemmy.world · edit-2 13 days ago

In before Meta buys Mozilla, lol.

Zuckerberg is on a “spoiling other tech giants with Facebook money” streak.

brucethemoose@lemmy.world · 13 days ago

that should terrify anyone who cares about a free and open internet.

We’re way past that… right? They’ve already kinda destroyed the open web and are squeezing it for money.

brucethemoose@lemmy.world · 13 days ago

The one thing Discord is good at is engagement, aka pinging people on their phones, repeating conversations that have been answered a million times, getting people drawn into rambling discussions…

Yeah it’s kind of a nightmare lol.

brucethemoose@lemmy.world · edit-2 13 days ago

It’s the most popular alternative, simple as that. There’s (sadly) nowhere else obvious the average community knows to go.

brucethemoose@lemmy.world · edit-2 13 days ago

How about “monopolizing the ad industry, going all in on manipulative engagement farming, and making billions off of destroying the internet they helped create, and possibly democracy”

I think the headline proves most people don’t really understand how deeply Google has embedded themselves in, and farmed, well, everyone.

brucethemoose@lemmy.world · 13 days ago

Wouldn’t private subs solve that?

brucethemoose@lemmy.world · 24 days ago

Alleged AMD Strix Halo APU Appears in Benchmark