Sem Karaman
Quantization model context length issue in LM Studio
So far my top choice
Wait, I should confirm if get_weather needs a location.
If I don't know the location, I can't provide it if it's a required parameter.
However, if I can't ask the user (because I'm in the middle of execution), I'll call it and see.
If it returns an error asking for location, I'll reply to the user asking for it.
But usually, these bots have a default location or use IP.
I'll proceed.

That trace is both syntactically and grammatically correct, and logically coherent. Furthermore, other models can tool call down to Q2 quants and below. So I don't think it's a quant-level issue. The problem is training: this model is better trained on natural-language grammar and logic than it is on tool calling.
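For reference, the kind of tool definition the trace above is reasoning about looks roughly like this. The schema follows the common OpenAI-style function-calling format (an assumption about the setup); `get_weather` and its required `location` parameter come from the trace, everything else is illustrative:

```python
# Hypothetical tool schema in the common OpenAI-style function-calling
# format; "get_weather" and the required "location" parameter come from
# the reasoning trace above, everything else is made up for illustration.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. 'Berlin'",
                },
            },
            # This is the field the model was agonizing over: a call
            # without "location" is invalid against this schema.
            "required": ["location"],
        },
    },
}

# A model that handles this well either asks the user for the location
# up front or emits arguments that satisfy the "required" list.
print(get_weather_tool["function"]["parameters"]["required"])
```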
The thing is, though, this is completely new with 3.6 35B. Qwen has been the orchestrator in my stack for a very long time now, and I've gone through many iterations of Qwen; this behavior only showed up with 3.6 35B.
Going off what JoeSmith said, it might just be a training issue with this particular iteration of Qwen. If you look at the official benchmarks posted by the Qwen team, 3.6 is almost always under 3.5, or only a few points above it, so perhaps the hype around 3.6 is overstated.
Me too, if I could fit it on my GPU. But I highly doubt it's the quantization: 3.5 35B and all sorts of merges of it work perfectly fine at pretty much any quant.
I see. In that case, if you've already tried sampling tweaks (repeat penalty 1.05-1.1) and the quant seems fine, then the only other fix that has sometimes worked for me is a direct system prompt telling it to 'not overthink' and to 'allow knock-on errors after the first solution'.
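In case it helps, here's a minimal sketch of that 'do not overthink' workaround as a chat payload for an OpenAI-compatible local server. The model id is a placeholder, and the exact prompt wording is just an example, not a tested recipe:

```python
import json

# Sketch of the system-prompt workaround described above.
# The model id is a placeholder, not a real LM Studio identifier,
# and the system prompt wording is only an example.
payload = {
    "model": "qwen-3.6-35b",
    "messages": [
        {
            "role": "system",
            "content": (
                "Do not overthink. Commit to the first workable solution "
                "and allow knock-on errors after it."
            ),
        },
        {"role": "user", "content": "Plan my weekend trip."},
    ],
}

print(json.dumps(payload, indent=2))
```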
Also, I think it's a given that you should be running 0.2 temperature for 'precise' responses.
I would try Q8 as a last resort.
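Pulling the suggestions from the last few posts together, the sampler side would look something like this. The field names follow llama.cpp-style conventions (an assumption; your server or LM Studio build may name them differently):

```python
# Sketch of the sampler settings suggested in this thread, using
# llama.cpp-style field names (an assumption about your backend).
sampler_settings = {
    "temperature": 0.2,      # low temperature for 'precise' responses
    "repeat_penalty": 1.05,  # suggested range in the thread is 1.05-1.1
}

# Note: Q8 is not a sampler setting; it's the quantization level of the
# model file itself, chosen when you download the GGUF.
print(sampler_settings)
```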
I see; this is usually because you're using Q4 quants locally. Try going up to Q5 or Q8, and reduce GPU layers and context length to fit into your VRAM.
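When deciding whether a higher quant will fit, a back-of-the-envelope size estimate helps. The bits-per-weight figures below are rough approximations for common GGUF quants (assumptions, not exact values), and the estimate ignores the KV cache and runtime overhead:

```python
# Rough VRAM sizing sketch. The effective bits-per-weight values for
# common GGUF quants are approximations, not exact figures, and the
# estimate covers weights only (no KV cache or runtime overhead).
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5}

def approx_model_gb(params_billion: float, quant: str) -> float:
    """Approximate GGUF file size in GB for a given parameter count."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billion * 1e9 * bits / 8 / 1e9

for quant in BITS_PER_WEIGHT:
    print(quant, round(approx_model_gb(35, quant), 1), "GB")
```

At 35B parameters this puts Q8 well past a single 24 GB card, which is why dropping GPU layers or context is usually needed alongside a higher quant.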
Oh, 3.6 35B is a literal never-ending reasoning loop for me. Like, 3 out of 6 times it's a kill-the-server type of deal.
Post your sampling setup.