Tinker Tuesday for June 24, 2025

This thread is for anyone working on personal projects to share their progress, and hold themselves somewhat accountable to a group of peers.

Post your project, your progress from last week, and what you hope to accomplish this week.

If you want to be pinged with a reminder asking about your project, let me know, and I'll harass you each week until you cancel the service.

Starting a greenfield project to build a stable software environment for ai coding. I can't think freely when operating within legacy codebases. So starting from scratch to build my ideal ai-native scaffolding.

I'm starting with a tool to auto-redact pdfs. Simple, useful, well constrained. Would appreciate suggestions on software paradigms that have worked well for ai development.

Stack:

  • Python FastAPI backend + Streamlit frontend + SQLAlchemy ORM over a relational DB
  • uv for packaging + environment
  • firebase for cloud provider + github actions for ci/cd
  • prefect (or something similar) for orchestration
  • Openai codex + github copilot as my LLM coding friends
  • Dockerized deployments

Some ideas:

  • Monolithic codebase to make it easy for agents to operate on it
  • Minimize implicit everything (state, side effects)
  • Maximize explicit everything (types for everything, explicit validators)
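To make the last two ideas concrete, here is a minimal sketch of what "explicit everything" could look like, using only the Python standard library. The names `BoundingBox` and `pad_box` are hypothetical and only for illustration, not taken from the actual project: the type is immutable, validates itself on construction, and the helper is a pure function with no hidden state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box on a page (hypothetical type, for illustration only)."""

    page: int
    x0: float
    y0: float
    x1: float
    y1: float

    def __post_init__(self) -> None:
        # Explicit validator: bad input fails loudly at construction time
        # instead of surfacing later as a misplaced redaction.
        if self.page < 0:
            raise ValueError(f"page must be >= 0, got {self.page}")
        if self.x0 >= self.x1 or self.y0 >= self.y1:
            raise ValueError(f"empty or inverted box: {self}")


def pad_box(box: BoundingBox, margin: float) -> BoundingBox:
    """Pure function: returns a new, larger box and never mutates the input."""
    if margin < 0:
        raise ValueError(f"margin must be >= 0, got {margin}")
    return BoundingBox(
        page=box.page,
        x0=box.x0 - margin,
        y0=box.y0 - margin,
        x1=box.x1 + margin,
        y1=box.y1 + margin,
    )
```

Because the dataclass is frozen and the function has no side effects, an agent (or a reviewer) can check a change by reading the signature and running a couple of assertions, without tracing shared state.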

I have a basic demo ready. Codex is already raising PRs. The redacted bounding boxes are off. And the LLM redaction logic is wonky. But, so far I am impressed at the LLM's ability to build a greenfield project by itself.


I'm a serviceable software engineer. Cracked engineers of the motte, what are some software systems paradigms that you think I should play with? I would especially like to know about paradigms that make it easier for agents to understand, write & verify auto-generated code.

Sounds pretty good. I'm a noob engineer so can't offer any feedback whatsoever. My only experience with llms and pdfs was when we tried to build something that dealt with large pdfs, and the biggest hurdle was the tables. I've heard OCR got better in the latter half of 2024, and I stopped my startup LARP right about that time.

Haha, yep, tables and rich extraction is pretty bad out of the box.

In this case though, I can confidently say I'm an expert on PDF extraction for llm use.

> I can confidently say I'm an expert on PDF extraction for llm use.

Any tips and tricks you picked up regarding this that aren't available out there on the web? I basically just throw the most powerful vision model at it and YOLO it.