Granite 4.1: IBM's 8B Model Is Competing With Models Four Times Its Size - Firethering

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 29 days ago

Granite 4.1: IBM's 8B Model Is Competing With Models Four Times Its Size - Firethering

Che's Motorcycle@lemmygrad.ml · 28 days ago

I might try this out next week. Tired of burning my monthly token allowance in Cursor in a couple weeks. :D

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · edit-2 28 days ago

If you have the memory, I can highly recommend Qwen3.6-35B-A3B-Q8. It’s hands down the best local model I’ve tried. It only activates about 3B params per token too, so it should stay usable with 16GB as long as the rest of the weights can be offloaded, or you can drop to a lower quant.
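
Back-of-the-envelope numbers for the memory question, as a rough sketch only. The "A3B" in the name is the count of params active per token, while the full 35B weights set the file size. Assumptions here that are mine and not from any spec: Q8 is treated as roughly 8 bits per weight and a lower quant as roughly 4 bits, `weight_gb` is a made-up helper name, and KV cache, context length, and runtime overhead are ignored, so real usage will be somewhat higher.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Compare the full parameter set against the per-token active subset.
# The bit widths are assumed round numbers, not exact quant formats.
for label, params in [("all 35B params", 35.0), ("3B active params", 3.0)]:
    for bits in (8, 4):
        print(f"{label} at ~{bits} bits/weight: ~{weight_gb(params, bits):.1f} GB")
```

So the per-token working set is small, but the whole model still has to sit somewhere (VRAM, system RAM, or disk), which is why the quant level matters on a 16GB machine.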

Che's Motorcycle@lemmygrad.ml · 28 days ago

I think I tried Qwen3.6, but the 8B version, and that tanked my 16GB. But I’ll give the smaller one a shot!

CriticalResist8@lemmygrad.ml · 28 days ago

DeepSeek v4 pro! You top up your credit as you go, and they’re having a sale until May 31st. Even without the sale, 1M output tokens is “only” 3.48, and Flash is only 0.28 per 1M output tokens.

Che's Motorcycle@lemmygrad.ml · 28 days ago

Not sure if I could swing DeepSeek at my job tho. Surprisingly, Cursor still comes with Kimi2 as a model option, so there’s that.

Che's Motorcycle@lemmygrad.ml · 28 days ago

Yep, it works on my machine. 😎

I’ll compare it with the 3B Qwen3.6 next week.