site banner

Small-Scale Question Sunday for April 7, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

3
Jump in the discussion.

No email address required.

Since people keep talking about/recommending them, how do you use an LLM? I mean, most everything I search online is paywalled, and the free "AI tools" I've tried weren't very impressive (and ended up either shut down or paywalled)?

Could somebody give some ELI5-level guidance and/or recommendations?

Options:

  • Google's mainstay is Gemini (previously Bard) is free(ish) for now, if you have a Google account. Open it, start writing. Not private.

  • Anthropic pushes Claude. You can try Haiku and Sonnet, the lighter- and mid-weight models free, but Opus was more restricted last I checked. Tends to be one of the stronger fiction writers, for better or worse.

  • Chat-GPT3.5 is available for free at here, 4.0 is a paid feature at the same sight. The paid version is good for imagegen -- I think it's what a lot of Trace's current stuff is using. Flexible, if a bit prudish.

  • Llama is Facebook's big model, free. Llama 2 is also available for download and direct run, though it's a little outdated at this point.

  • LMSys Arena lets you pit models against each other, including a wide variety of above. Again, not private. Very likely to shutter with little notice.

  • Run a model locally, generally through the use of a toolkit like OobaBooga webui. This runs fastest with a decent-ish graphics card, in which case you want to download the .SAFETENSORS version, but you can also use a CPU implementation for (slow) generation by downloading GGUF versions for some models. Mistral 8x7B seems to be the best-recommended here for general purpose if you can manage the hefty 10+GB VRAM minimum, followed by SOLAR for 6GB+ and Goliath for 40+GB cards, but there's a lot of variety if you have specific goals. They aren't as good as the big corporate models, but you can get variants that aren't lobotomized, tune for specific goals, and there's no risk of someone turning it off.

Most online models have a free or trial version, which usually will be a little dumber, limited to shorter context (think memory), or be based on older data, or some combination of the above. Paid models may charge a monthly fee (eg, ChatGPT Plus gives access to DallE and ChatGPT4 for 20 USD / month), or they may charge based on tokens (eg, ChatGPT API has a per 1 million input and output token price rate, varying based on model). Tokens are kinda like syllables for the LLM, between a letter to a whole word or rarely a couple words, which are how the LLM breaks apart sentences into numbers. See here for more technical details -- token pricing is usually cheaper unless you're a really heavy user, but it can be unintuitive.

For use:

  • Most models (excluding some local options) assume a conversational model: ask the program questions, and it will try to give (lengthy) answers. They will generally follow your tone to some extent, so if you want a dry technical explanation, use precise and dry technical terms; if you want colloquial English, be more casual. OobaBooga lets you switch models between different 'modes', with Instruct having that Q/A form, and Default being more blank, but most online models can be set or talked into behaving that way.

  • Be aware that many models, especially earlier models, struggle with numbers, especially numbers with many significant figures. They are all still prone to hallucination, though the extent varies with model.

  • Long conversations, within the context length of the model, will impact future text; remember that creating a new chat will break from previous context, and this can be important when changing topics.

  • They're really sensitive to how you ask a question, sometimes in unintuitive ways.

Thanks! Maybe you'll mind answering two questions: About using local models, can it be tweaked so it doesn't forget context so easily? Maybe using learning runs on previous conversations? How does chatGPT retains context? I understand it does multiple processings for each prompt, and does lossy compression of previous chat history. How to simulate in in API?

You can finetune models on your personal data or information, but that only does so much. If you're more technically inclined, you can try setting up Retrieval-augmented generation, where the model queries and existing database and tries to answer based off the knowledge there and not just what it came baked in with.

Don't ask me how that can be done, but I know it's a thing. My PC isn't good enough to fuck around with the local models worth using, courtesy of Nvidia and their stingy amounts of VRAM.

How does chatGPT retains context?

I presume you're not talking about nitty gritty algorithmic details (which would be the self-attention mechanism IIRC) and instead mean how it continues a conversation or remembers details about a user?

Well, the official implementation has a "memory" feature where it gets to remember tidbits about your preferences as a user, as well as some relevant personal details like location.

The way it works is that the entire conversation is fed back to the model, with specific signs that tell it when it or the user was speaking, and it'll resume where the user left off. I think the API works this way by default, but my OAI credits expired ages ago, so if it seems to be treating each input as a fresh prompt, you need one of the many frontends available that use your API key and then handles the matter of copying back your entire conversation, which is tantamount to the model "remembering" it.

Is there no way to use Claude 3 Opus' web feature if you don't have direct access to it? I'm trying out Claude 3 Opus through openrouter.ai because Anthropic don't sell to European countries yet, but I can't see any way to get more than the usual text feature. There's image upload as well but it doesn't work perfectly.

Great post, but I'm consistently bemused that people forget that GPT-4 and DALLE-3 are freely available through Bing. I'm sure Microsoft is too.

Pick your poison, ChatGPT or Claude Sonnet. Go to their website, make an account with username and password, do a mobile verification and give them your email, promise not to do anything bad... and that's it. You just ask the machine your questions: "What are some good names for an enormously long armoured train?" and have a conversation with it: "Why do they [spiders] have eight eyes if their vision isn't so great, while birds have good eyesight with two?" You can ask it for code, stories, translation, just about anything you might ask a human (except anagrams and certain kinds of wordplay) and you get a pretty decent response, albeit it may make things up or give you blather.

It's like using microsoft outlook or gmail in your web browser, you're not downloading stuff unless you're an advanced user with a very powerful PC.

If you want the best models, you have to subscribe to Claude Opus or GPT-4, lesser models are free with ratelimiting. It's no harder than using netflix really.

do a mobile verification

Does that require having a smartphone?

It's no harder than using netflix really.

Well, given that I've never used them, or any other streaming service (particularly as my Internet isn't good enough to support them), this isn't exactly a helpful comparison.

@self_made_human you’re the one I’d ask, here.

I have been summoned.

Well, Ranger has done a pretty good job, but if you particularly want GPT-4 (or a fork) for free, then you can use Bing Copilot, either through the Bing app or your browser.

You need to sign up with your email (I don't recall if it needs to be a Microsoft account or using Outlook), and there you go. You flip a toggle to make it use GPT-4, and you're set.

Pretty KISS, and it's the best free LLM out there, being on par with GPT-4 as served through paid OAI ChatGPT.

Everything else that good or debatably better requires money, or a willingness to find shady discord bots or sign up for research previews and so on.

Summarizing dense, relatively inscrutable material when I don't have the time to read it for myself.

Quick diagnosis of issues with minor appliances or mechanical issues. I don't trust them for medical purposes. Describe the year make and model of your car and any weird noises or behavior it is making.

Song recommendations (i.e. here's a list of songs I like, give me more on this theme)

Nigh instantaneous proofreading and editing of professional letters, documents, or similar.

Quick feedback loop for brainstorming ideas or quickly 'prototyping' concepts you have had difficulty envisioning or expressing.

Right now they don't have much 'agentic' uses (I'd really like one that can e.g. order pharmacy refills or schedule oil changes for me) but I think the basic capabilities are there.

I think you misunderstood my question. I don't mean a list of use-cases like this, I mean how do you do these things you list?

Explain like I'm five — or, alternately, a sixty-something Boomer. What URL do you enter in your browser, or software do you download. How do you use the resulting page or program?

Go to www.chatgpt.org/chat, you won't have to sign up or log in. Then type in the box.

I don't find them enormously useful, but two specific uses for me:

  1. when reading a textbook, ask it to explain some passage (e.g. on a weird C++ detail) (usually helpful)
  2. write code in some language/framework, given approximately that code in another language/framework (hit or miss)

Again, not what I'm asking.

What LLM do you use to do these? How do you access it? How much does it cost? How, in detail, do you do these things you list?

ChatGPT and Bard, mostly the former, unpaid version, through browser. Sample transcript on the diamond problem in C++ multiple inheritance: https://chat.openai.com/share/603e851e-daa0-4c4b-81d7-1b8332897694