CPU-only i7-1355U koboldcpp works surprisingly well

@raffa · 2 months ago

NSFW

My first test was with Starcannon-Unleashed-12B-v1.0-f16, a 23 GB model. I did not expect that laptop to be usable at all.

@magn418 · edited 2 months ago

I think running the model at unquantized FP16 precision is a waste. You should try something between Q4_K_M and Q6_K (or at least Q8_0, which is supposed to be the same quality as FP16). A smaller quantized file means less data to read from memory for every token, so it should be considerably faster… At least twice as fast.

(The GGUF page of that model has a list of recommended quantization levels.)
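
For a rough idea of where that speedup comes from, here is a back-of-the-envelope sketch. It assumes CPU token generation is mostly limited by memory bandwidth, so speed scales roughly with file size. The bits-per-weight numbers are approximate figures for these quant types (Q4_K_M mixes types, so its value is only an average), and the function names are made up for illustration. It simply scales the 23 GB F16 size mentioned above.

```python
# Approximate bits per weight for some GGUF quantization types.
# These are assumptions for illustration; real files differ a little
# because some tensors are kept at a different precision.
APPROX_BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.5625,
    "Q4_K_M": 4.85,  # rough average, this quant mixes several types
}


def estimated_size_gb(f16_size_gb, bits_per_weight):
    """Scale the F16 file size down to another quantization level."""
    return f16_size_gb * bits_per_weight / 16.0


def rough_speedup(bits_per_weight):
    """Speedup over F16, assuming generation is purely bandwidth-bound."""
    return 16.0 / bits_per_weight


# 23 GB is the F16 size reported for the 12B model in this thread.
for name, bpw in APPROX_BITS_PER_WEIGHT.items():
    size = estimated_size_gb(23.0, bpw)
    print(f"{name:7s} ~{size:5.1f} GB, ~{rough_speedup(bpw):.1f}x vs F16")
```

Treat the output as an upper-bound estimate; prompt processing is more compute-bound and will not scale the same way.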

@raffa · 2 months ago

thanks for the tips!