site banner

Small-Scale Question Sunday for April 7, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

3
Jump in the discussion.

No email address required.

Thanks! Maybe you'll mind answering two questions: About using local models, can it be tweaked so it doesn't forget context so easily? Maybe using learning runs on previous conversations? How does chatGPT retains context? I understand it does multiple processings for each prompt, and does lossy compression of previous chat history. How to simulate in in API?

You can finetune models on your personal data or information, but that only does so much. If you're more technically inclined, you can try setting up Retrieval-augmented generation, where the model queries and existing database and tries to answer based off the knowledge there and not just what it came baked in with.

Don't ask me how that can be done, but I know it's a thing. My PC isn't good enough to fuck around with the local models worth using, courtesy of Nvidia and their stingy amounts of VRAM.

How does chatGPT retains context?

I presume you're not talking about nitty gritty algorithmic details (which would be the self-attention mechanism IIRC) and instead mean how it continues a conversation or remembers details about a user?

Well, the official implementation has a "memory" feature where it gets to remember tidbits about your preferences as a user, as well as some relevant personal details like location.

The way it works is that the entire conversation is fed back to the model, with specific signs that tell it when it or the user was speaking, and it'll resume where the user left off. I think the API works this way by default, but my OAI credits expired ages ago, so if it seems to be treating each input as a fresh prompt, you need one of the many frontends available that use your API key and then handles the matter of copying back your entire conversation, which is tantamount to the model "remembering" it.