
Small-Scale Question Sunday for April 7, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


Since people keep talking about/recommending them, how do you use an LLM? I mean, most everything I search online is paywalled, and the free "AI tools" I've tried weren't very impressive (and ended up either shut down or paywalled).

Could somebody give some ELI5-level guidance and/or recommendations?

Options:

  • Google's mainstay is Gemini (previously Bard), free(ish) for now if you have a Google account. Open it and start writing. Not private.

  • Anthropic pushes Claude. You can try Haiku and Sonnet, the lighter- and mid-weight models, for free, but Opus was more restricted last I checked. Tends to be one of the stronger fiction writers, for better or worse.

  • ChatGPT-3.5 is available for free here; 4.0 is a paid feature at the same site. The paid version is good for imagegen -- I think it's what a lot of Trace's current stuff is using. Flexible, if a bit prudish.

  • Llama is Facebook's big model, free. Llama 2 is also available to download and run directly, though it's a little outdated at this point.

  • LMSys Arena lets you pit models against each other, including a wide variety of the above. Again, not private. Very likely to shutter with little notice.

  • Run a model locally, generally through a toolkit like the OobaBooga webui. This runs fastest with a decent-ish graphics card, in which case you want to download the .safetensors version of a model, but you can also use a CPU implementation for (slow) generation by downloading GGUF versions of some models. Mixtral 8x7B seems to be the best-recommended general-purpose option if you can manage the hefty 10+GB VRAM minimum, followed by SOLAR for 6GB+ and Goliath for 40+GB cards, but there's a lot of variety if you have specific goals. Local models aren't as good as the big corporate ones, but you can get variants that aren't lobotomized, tune them for specific goals, and there's no risk of someone turning them off.
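If you go the local route, many of these toolkits (the OobaBooga webui and llama.cpp's server among them) can expose an OpenAI-style HTTP chat endpoint. A minimal sketch using only the Python standard library, assuming a server listening on localhost:5000 (the port and exact endpoint path vary by toolkit, so treat both as placeholders):

```python
import json
import urllib.request

def build_chat_request(prompt, url="http://localhost:5000/v1/chat/completions"):
    # OpenAI-style chat payload; most local servers accept this shape.
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 200,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# To actually send it (requires a running local server):
# with urllib.request.urlopen(build_chat_request("Why is the sky blue?")) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The same request shape works against the paid hosted APIs too, just with a different URL and an API key header.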

Most online models have a free or trial version, which usually will be a little dumber, limited to shorter context (think memory), based on older data, or some combination of the above. Paid models may charge a monthly fee (e.g., ChatGPT Plus gives access to DALL-E and GPT-4 for 20 USD/month), or they may charge by tokens (e.g., the ChatGPT API has a price per 1 million input and output tokens, varying by model). Tokens are kinda like syllables for the LLM, anywhere from a single letter to a whole word or, rarely, a couple of words; they're how the LLM breaks sentences apart into numbers. See here for more technical details. Token pricing is usually cheaper unless you're a really heavy user, but it can be unintuitive.
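The subscription-vs-token tradeoff is easy to sanity-check with arithmetic. A sketch with made-up but plausible numbers (the rates and per-chat token counts below are illustrative assumptions, not any provider's actual pricing; check the current pricing page):

```python
def api_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Cost in USD for token-billed usage, given per-million-token rates."""
    return (input_tokens * in_price_per_m + output_tokens * out_price_per_m) / 1_000_000

# Hypothetical figures for illustration only.
monthly_sub = 20.00                      # a flat ChatGPT Plus-style subscription
per_chat_in, per_chat_out = 500, 700     # rough token counts for one exchange

# At, say, $0.50/M input and $1.50/M output tokens:
cost_per_chat = api_cost(per_chat_in, per_chat_out, 0.50, 1.50)
breakeven = monthly_sub / cost_per_chat
print(f"~${cost_per_chat:.5f} per exchange; ~{breakeven:,.0f} exchanges to match $20/month")
```

At rates like these you'd need thousands of exchanges a month before the flat subscription wins, which is why token pricing usually comes out cheaper for casual use.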

For use:

  • Most models (excluding some local options) assume a conversational format: ask the program questions, and it will try to give (lengthy) answers. They will generally follow your tone to some extent, so if you want a dry technical explanation, use precise and dry technical terms; if you want colloquial English, be more casual. OobaBooga lets you switch models between different 'modes', with Instruct having that Q/A form and Default being more blank, but most online models can be set or talked into behaving either way.

  • Be aware that many models, especially earlier models, struggle with numbers, especially numbers with many significant figures. They are all still prone to hallucination, though the extent varies with model.

  • Everything earlier in a conversation, up to the context length of the model, will influence future text; remember that creating a new chat breaks from previous context, and this can be important when changing topics.

  • They're really sensitive to how you ask a question, sometimes in unintuitive ways.

Is there no way to use Claude 3 Opus' web feature if you don't have direct access to it? I'm trying out Claude 3 Opus through openrouter.ai because Anthropic doesn't sell to European countries yet, but I can't see any way to get more than the usual text feature. There's image upload as well, but it doesn't work perfectly.