
Small-Scale Question Sunday for November 26, 2023

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


The 4060 Ti has 16 GB of VRAM, enough for 13B/20B LLM models at certain quantizations. I can't find a comparable card with the same capacity that isn't second-hand.
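
As a rough back-of-the-envelope sketch of why 16 GB lands in that 13B/20B range: weight memory is roughly parameters × bits-per-weight, plus some headroom for the KV cache and activations. The flat 2 GB overhead and the bit widths below are assumptions for illustration, not measurements of any particular runtime.

```python
# Rough VRAM estimate for a quantized LLM: weights plus a flat allowance
# for KV cache/activations. Numbers are illustrative assumptions only.

def vram_needed_gb(params_billion: float, bits_per_weight: float,
                   overhead_gb: float = 2.0) -> float:
    weight_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

for size in (13, 20):
    for bits in (4, 5, 8):
        print(f"{size}B @ {bits}-bit: ~{vram_needed_gb(size, bits):.1f} GB")
```

By this estimate a 13B model fits at 8-bit (~15 GB) and a 20B model fits at 4-5 bit (~12-14.5 GB), while 20B at 8-bit (~22 GB) would not, which matches the "certain quantization" caveat.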

Yeah, that's fair. Most of the stuff I've seen with LLMs either pushes for tiny models at low quantization for speed, or goes all the way to 65B on CPU for intelligence, but I'm sure there are a lot of use cases in the middle.