
Wellness Wednesday for September 7, 2022

The Wednesday Wellness threads are meant to encourage users to ask for and provide advice and motivation to improve their lives. They aren't intended as 'containment threads', and any content which could go here could instead be posted in its own thread. You could post:

  • Requests for advice and / or encouragement. On basically any topic and for any scale of problem.

  • Updates to let us know how you are doing. This provides valuable feedback on past advice / encouragement and will hopefully make people feel a little more motivated to follow through. If you want to be reminded to post your update, see the post titled 'update reminders', below.

  • Advice. This can be in response to a request for advice or just something that you think could be generally useful for many people here.

  • Encouragement. Probably best directed at specific users, but if you feel like just encouraging people in general I don't think anyone is going to object. I don't think I really need to say this, but just to be clear: encouragement should have a generally positive tone and not shame people (if you feel that shame might be an effective tool for motivating people, please discuss this so we can form a group consensus on how to use it rather than just trying it).


Now that the servers are well again, I'll repost this explanation here!

We're using Kubernetes, giving us the whole Treat Your Servers Like Cattle, Not Pets thing. Kubernetes allows us to dispose of old servers and start up new ones pretty much immediately; if we do run into load problems, or optimize the site to the point where we no longer have load problems, I can just switch the backend hardware around and everything is solved.

This does require that Kubernetes knows everything about the servers in a way that lets it restart them. Earlier, I was doing some cleanup of old pre-stable-site configuration and I deleted the wrong thing; I took out one of the bits required for the database server to start. This didn't break the site because the database server had already started; Kubernetes just said "uh-huh, everything is fine here, no problems" and kept on trucking.

Later, and annoyingly right after I went to bed, our host decided they wanted to do a server swap - they probably had a rack failure or something - and so Kubernetes dutifully noticed that our server had vanished, returned it to the pool, spun up a new server, and tried to restart everything.

At which point it sat there saying "hey, I can't start the database server. Help, please."

And I was in bed.
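
For anyone who wants the failure mode spelled out, here is a rough toy model in Python. It is not real Kubernetes code and every name in it (`Cluster`, `db-credentials`, `replace_node`) is made up for illustration. The only point it tries to show is that dependencies get checked when something starts, not while it is running, so a deleted piece of configuration can sit unnoticed until the next restart.

```python
# Toy model, not real Kubernetes code: all names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Config objects the cluster knows about (stand-ins for secrets, volumes, etc.).
    config: set = field(default_factory=set)
    # Names of workloads that are currently running.
    running: set = field(default_factory=set)

    def start(self, name, requires):
        """Start a workload; dependencies are only checked here, at start time."""
        missing = sorted(set(requires) - self.config)
        if missing:
            raise RuntimeError(f"cannot start {name}: missing {missing}")
        self.running.add(name)

    def healthy(self, name):
        """A running workload stays healthy even if its config was deleted later."""
        return name in self.running

    def replace_node(self, workloads):
        """Simulate the host swapping the server: everything must start again."""
        self.running.clear()
        for name, requires in workloads.items():
            self.start(name, requires)


workloads = {"database": ["db-credentials"]}
cluster = Cluster(config={"db-credentials"})
cluster.start("database", workloads["database"])

cluster.config.discard("db-credentials")  # the accidental cleanup
print(cluster.healthy("database"))        # still reports healthy

try:
    cluster.replace_node(workloads)       # the overnight server swap
except RuntimeError as err:
    print(err)                            # only now does the missing piece surface
```

Calling something like `replace_node` on purpose, while awake, is the same idea as deliberately faking a server change to see whether everything comes back up.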

But this actually wasn't the only issue. I did a writeup on the startup pains we had. A quote:

As near as I can tell, there is a switch on the GUI. But this switch is also overridden by some settings in my configuration. Importantly, it's overridden irregularly; sometimes you'll do something, and it'll say "oh shucks, gotta go check that switch!" Because I hadn't realized this, it went and checked it and dutifully turned it off again.

I think I've fixed that now.

Nope! Hadn't fixed it.

I think I've fixed it now. But I might not have.

Later tonight I'm going to intentionally fake a server change in the same way it happened today. With luck it'll just work; without luck, I'll fix it manually and then give it another try.

Kubernetes is truly the regular expressions for the new century. The only thing that kept me sane with k8s was relentless automation. No manual switches anywhere, no "let me just kubectl apply the correct secret". 100% of the changes go through gitops 100% of the time.
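
To make the "everything goes through git" idea a bit more concrete, here is a minimal sketch of a drift check in Python. It is a toy, not a real GitOps tool: the two dictionaries are assumed stand-ins for manifests rendered from the repository and objects read back from the cluster, and `find_drift` is a hypothetical helper name.

```python
# Toy drift check, not a real GitOps tool: the dictionaries stand in for
# manifests rendered from git and objects read back from the cluster.
def find_drift(declared, live):
    """Return a list of (name, problem) pairs, sorted by name, for every mismatch."""
    drift = []
    for name in sorted(declared.keys() | live.keys()):
        if name not in live:
            drift.append((name, "missing from cluster"))
        elif name not in declared:
            drift.append((name, "not tracked in git"))
        elif declared[name] != live[name]:
            drift.append((name, "differs from git"))
    return drift


declared = {"db-credentials": {"user": "app"}, "web": {"replicas": 2}}
live = {"web": {"replicas": 3}, "debug-pod": {}}

for name, problem in find_drift(declared, live):
    print(f"{name}: {problem}")
```

Anything this reports is either a manual change that never made it into git or something that got deleted by hand, which is exactly the class of surprise relentless automation is meant to rule out.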

Yeah, I honestly believe it's a better paradigm and a far better model.

But god, it's like the worst possible implementation of that model.