I've been tinkering with LLMs recently.
I work as an engineer and I have been trying to build a second-brain repository of information in a database that I can use as context for an LLM to query. Something like Perplexity.ai, but on a local machine because I'd be working with company data. In the last week or so I've uploaded PDFs of whitepapers, books, and industry standards. So far, I've found that the LLMs are miles ahead of plain text search. The platform provides citations so the underlying references can be found quickly.
I am using a standard install of Open WebUI and Ollama on a Linux machine. I've tried various smaller models (DeepSeek-R1, Phi-3, Phi-4) and have been generally successful, but I find that I just don't have enough computing power for larger models. I am comfortable installing and setting up software in a terminal, but I have no formal coding/software development background. So if I can do this, you can too.
Next week I plan to upload several years' worth of e-mails into the database, and see if I can run queries against it.
I wonder how much information I can upload into the database before I start running up against constraints.