ControlsFreak
User ID: 1422
the Christian-Socialist-Democratic-Party/Liberal-Unionist-Secession-Party/Green coalition in [Euro country] is breaking down over the question of whether state pensions should cover ceiling fans.
I would love a full comment explaining this one.
I don't know what you're talking about. You gave a nice definition and the properties of your definition. I'm not even asking for more at this point. Yes, I did ask you to say at least something about what your words mean, because if you can't manage to explain it at all, it's highly likely that you're confused about your own words. But at this point, I'm just looking at your nice definition/properties and observing that you solved your own problem from before. This is good news! This is wonderful news! Shouldn't you be happy that you had a problem before, and now you've solved it? The "inherent tension" in your philosophical positions has evaporated! That's the whole point of this OP.
They're very not new to me, but apparently, they're pretty new to you, because you thought that this was a very serious issue for you. But now you've solved your own problem, in like a quarter of a second. Record time in philosophy! Just needed a common sense and consensus definition of evil!
I need to go hunting on SMBC, because he had to have made a comic about this. If not, he really needs to.
That is, I'm pretty sure you've just solved your problem of evil, in quite the unique way.
It certainly seems logically plausible that whatever god may have created the universe, at the time that he/she/it created the universe, thought, "Hmmmm, I wonder if it would be evil to create a universe where eventually, one day, maybe, depending on how things go, a two year old will get ALL?" Perhaps this deity looked around, took an opinion poll to gauge the vibes, determined from the (presumably otherwise empty) room that it seemed a-ok, and proceeded to create said universe. Guess that just wasn't evil, by a common sense and consensus definition of the term.
This dovetails a bit with my footnote below about figuring out what "box" a person's world is. CSRs have scripts for the majority of the issues that they see on a regular basis. Task number one is to figure out whether your issue fits within one of their scripted boxes. If so, you're probably in good shape. If not, then individual quality can vary substantially. I've had multiple experiences where, after determining that my situation did not fit their script, it was very apparent that it would be important to get a person whose box extended beyond the scripts and included the knowledge/intelligence sufficient to work the problem. I've had times where, for example, they told me they could solve the problem, but they could not explain how the steps would work well enough that I was comfortable proceeding. A hang-up and a call back later, and I got someone who was very capable of conceptualizing the problem properly, taking a few minutes to work through how a solution would work, and (critically) explaining how it was going to work. Whether a simple call back to another Tier 1 CSR will get you that type of person versus having to fight to get to a Tier 2 person may vary.
When I talk about consensus morality, I'm talking today.
This definition is valid at, like, every snapshot point in time, then, yes? The same action could be "evil" at one point in time and "not evil" at a different point in time?
Those are all necessary and sufficient conditions in your definition of evil? We can go through them one by one, but maybe let's just start with the last one. If, uh, someone (who?) isn't "willing to enforce" a "preference", then it's, uh, not evil to go against it? What even is "willing to enforce"? Like, does the enforcement need to be realized? Can it be weighed against other things? If the someone (who?) is like, "Yeah, I'm willing to enforce this, but due to other considerations (other priorities, something inherently difficult about detection or enforcement, etc.), I'm not going to put too much time and effort into it," does that still count for determining whether something is evil or not?
Whence a consensus that evil means "in bad taste"? I guess perhaps you're not incorporating consensus at this level of generality, so are you instead just asserting that your definition of evil is "something done in bad taste, as measured by some vibes about a consensus" or something?
I'm using a common-sense or consensus definition of evil
What's that? Whence consensus?
@P-Necromancer I think I'd like to bundle these two, as they're getting at a similar thing.
I agree with what you both say. Plenty of humans will come up with ridiculous things to do, or even just things that might make sense but have problems, and if you're not supervising them appropriately, they may just do their things. But that's like, the essence of technical debt?
For the example of fixing some OS issue, imagine I didn't have really any technical knowledge of how things work (say, I don't really even know what the registry is unless a tech/LLM tells me something about it). Maybe I'd take my computer to a human tech. Could even be a corporate IT guy. Perhaps, knowing that I don't have a clue, I just give it to him. "Here's my problem; please fix it Ralph Rufus."
Who knows what he'll get up to? What stuff he'll mess with along the way. Things he'll try just because, and then maybe leave it in a changed state, even though it didn't progress toward a solution to the actual problem. This cruft can build up. After years of having this corporate IT guy and that corporate IT guy and the other corporate IT guy just doing who knows what, maybe at some point, things get bizarre enough that the next one says, "Dude, stuff is wild here; we probably should just wipe it and clean install."
That makes sense, and it's utterly routine in the world with humans. I hear my wife tell me about weird stuff that's broken on her work computer... and even weirder stuff that whatever IT guy she talked to did. She doesn't have a clue what's going on. I get it.
I also agree that as of right now[1], the best is when you know enough about what's going on that you can get it to explain things and are able to then understand it, yourself. Get it to document things fully, provide a suite of tests, have a back-and-forth. It can provide tons of utility![2]
...but, if you genuinely lack enough knowledge to be a competent participant of that back-and-forth, it still may let you "just do stuff". There can still be tons of utility here, as it may still get things right a lot, and folks who have had some problem that they've wanted to fix for ages and could never get the time with a competent human and certainly couldn't figure it out on their own will be able to fix many of those problems, and it will be wonderful. It may also, occasionally, along the way, build up technical debt.
Note that I'm not saying that this is some unique problem that is fundamentally different from dealing with humans. Instead, I'm now conceptualizing it in the same way that I conceptualize human-driven technical debt. I think that dovetails well with both of your descriptions. If there is a downside, it's probably that many folks who wouldn't have ever tried to fix that OS problem or make that code will now do it, and they might be building up technical debt while they're also accumulating utility. They may choose to do it a lot, and they may jump into it with both eyes shut. This may still be the right choice! They may still get more utility from all the wins than they lose from either discrete bad events or built-up cruft.
This is a conflict, a tension, which is why I said that I was, indeed, conflicted. I am still neither an "LLM good" nor an "LLM bad" person.
1 - I continue to take no position on the question of to what extent future progress will render this concern de minimis.
2 - To briefly respond to the 'shouldn't you just hang up on a human customer service agent who you can tell is going to be unhelpful', yes. Absolutely. I didn't bother with the specific issue of it getting hung up on deleting the registry value, because I was close enough that hearing it append its bad idea one more time wasn't important to me. I did mention that I used multiple LLMs, and that was part of it; I left out every twist and turn of the story, but yeah, I not only just scrapped the prior context; I even just jumped to different models. This is a useful skill to have, when dealing with humans and LLMs. Even when dealing with some human professionals, my life changed long ago when I realized that I could grasp some understanding of what their "box" of the world was, and once I realized that my situation was outside of their "box", I just moved on from them. But the concern here is that you have to have just enough knowledge about the thing to be able to gauge where their box is, when you're outside of it, or when they're going off the rails. There are a lot of people who don't have that with humans, and they're not going to have that with the many many more things that they're going to want to do with LLMs. I don't have that with all sorts of different humans or things that I might want to do with LLMs.
Yeah, I chose not to, because of course, the goalposts will be moved to, "You should have used my preferred LLM instead." I just mentioned that I used multiple different ones, multiple different companies. Thinking always. Not $200/mo. Of course, someone will just say, "You won't have any problems if you pay $200/mo for my preferred LLM." Maybe? I even note that they will perhaps get better! Yes, they're all getting better, even the cheaper ones. They get better as do the expensive ones. But will expensive ones still produce technical debt? Why do you think they will or will not? I don't know if they will! I'm saying that I don't know. You seem to be implying, but not even stating that you know (or how you know) that they certainly won't, if only you pay enough or wait an unspecified period of time.
I'd note that a common feature of your style of comment is that you immediately accuse your interlocutor of "dimmish (sic) the utility of LLMs". But I didn't do that! I said that there were ways in which they provided quite a bit of utility! Imagine having a discussion about any other technology like this. "You know, this nuclear science stuff is pretty cool. Can provide a lot of energy for cheap. Miiiight be worried about some possible dangers that might come up, like, ya know, bombs or stuff." "Why don't you tell us exactly what device you've been using in your own experiments?!?! Why are you trying to dimmish the utility of nuclear science?!?!" Like, no dawg, you just sound like you're not paying attention.
I commented recently about my personal experience using LLMs for work-related math stuff. I found that it wasn't great at giving me a whole proof (or really, much of a part of a proof) without error, but it helped me with some idea generation and pointing me to tools that I wasn't familiar with. To be fair, I haven't yet gotten access to any of the ones that are supposed to be hooked up to automated theorem provers, so maybe they'll work better (I've signed up for one, but their system wasn't working at the time; starting this post prompted me to try again, and I was able to get in; maybe I'll find time to really test it soon).
I guess I'd just like to report some experience with LLMs for other computer stuff. I had an extremely minor issue with one of my PCs. I wondered if LLMs could help. Through the course of this, I tried using multiple different LLMs.
The good is that it did have some good ideas for how to get started, and possible causes of the issue. I may have caused a bit of a false start off the bat, because rather than really consider the multiple ideas that it gave me, I thought, "Yeah, I could totally see X being the problem; maybe I should just do that." It was easy for me to think that I could just do the likely fix; it's normally an easy thing to do, and there's zero harm if it wasn't actually the cause of the problem. However, it turned out that my specific system has a surprisingly stupid design, and it was going to be a much greater pain to do it. So I resigned myself to hoping that it was one of the other root causes suggested by the LLM in the meantime, and I'd come back to the first idea later if I could confirm that it really was that.
The extra good is that, in hindsight, I am very sure that it was, indeed, one of the other root causes. So thankfully, I didn't waste too much time on the false start. However, once I began to implement my preferred fix, something strange was going wrong.
This is where we get into the bad. In diagnosing what was going wrong with the attempted fix, it got allllll into mess that was actually pretty low probability. Suggested permissions issues, suggested problems with registry entries. A couple of them were low risk, and at the time, seemed like they could be plausibly related, and I did mess with a couple things. Others were the ugly. No, Mr. Bot, I am not going to just delete that registry value (especially after I did a little non-LLM side research on what that registry value actually does).[1]
In the end, when I told it that I was balking on doing what it wanted me to do, it suggested that I could, in the meantime, do one of the standard procedures in a different way. Of course, it thought that doing this would just be a step toward me ultimately having to delete that registry value. But I figured trying this alternate procedure at the very least couldn't hurt, and indeed, it helped by giving me an actual error code!
The LLM thankfully helped me decode it (likely faster than a google search), which allowed me to adjust my fix. This was actually the key step, after which, I was able to understand what I think was going on and manage later hiccups. Unfortunately, the LLM didn't grasp this. It still was set on, "Great! Now you're ready to delete registry values!" Sigh.
After I adjusted my fix, I was able to get another (unrelated) error code from another step in the process. This time, I actually tried a google search for the error code first, and it came up empty, but the LLM told me exactly what it was (and it made sense), which was very nice and convenient. One final adjustment, and I think I have it working just fine.
The only remaining bad point is that the LLM still didn't realize that we'd fixed the problem! It still was all, "...and now you're ready to delete stuff in the registry!!!" I told it multiple times that the thing that was broken which was motivating it to think of deleting the registry value was no longer broken. Didn't matter; it really wanted to nuke that thing.
It all still leaves me quite conflicted. It was great in doing some idea generation and decoding error messages. But man, does it leave me scared to think about all the people who are just giving LLMs free rein to take actual actions in their computer. I focused here on the registry key issue, but there were more things along the way that it came up with that left me thinking, "...no, I'm pretty sure I don't want to mess with that unless I've got a lot more information and confidence about what's going on." If I had just said, "Go fix this, Ralph Wiggums[2]," who knows what sort of bollocks it would have done to my system. This worries me, because I hear all these people talking about how great it is that they can just tell their LLM to go change whatever it thinks is necessary to go fix whatever problem on their computer... and they really think they're rapidly approaching a world (if they're not already there) where they'll be happy to give it full access to just do anything to it.
It also dovetails with the worries about vibe coding. Forget about changing some OS settings; they're actually choosing to run arbitrary code on their system that is generated by an LLM. Yes, some folks do rock solid sandboxing, but let's be honest, if you're making anything that you or anyone else is going to actually use, it's not going to stay in a sandbox for long. I listened to a podcast this week, where one of the hosts, midshow, was like, "Yeah, I had this LLM make this program. I'm gonna have it add email functionality." And he just did it, live on air. Sandbox? Schmandbox. It now sends emails. What's it actually doing along the way? Who knows? He didn't check any of the code; of course he didn't. He wanted to see it send an email while he was still live.
"Technical debt" is the phrase that went through my mind in thinking about these experiences together. Yes, I was poking around at permissions/registry; sometimes, those things genuinely just get messed up. I've had experiences where my permissions have just gotten borked for completely unknown reasons; sometimes, I've been able to fix them; sometimes, stuff like that happens and you get to the point of, "This thing has been running a long time, and who knows what the long history of stuff has been, when this or that may have gotten corrupted; better to just wipe the OS and install clean." The term is more traditionally used with coding, when stuff has just gotten glommed on, piece by piece, and at some point, it's better to just throw it all away and invest in a clean slate rather than continuing to maintain the old mess. You can glom on email functionality to your vibe code in a few sentences and about twenty minutes. You don't need to think about whether that may be accruing technical debt.
Maybe the LLMs will keep getting better, and it'll be even easier to clean slate stuff in the future, so the pain of accumulating technical debt won't be as bad. But man, I can't help but think that a lot of people are unknowingly setting themselves up, both in their systems and in their vibe code. That one day, they'll just say, "This is broken; I don't know why; it's a mess of stuff that LLMs have globbed onto it over years; just go fix it, Ralph," and it will just do whackier and whackier stuff to their system/code that is already so whacked out that it just doesn't fit the mold of training data used to train the LLM.
1 - FTR, it was actually super relevant to be at least looking around in the registry, and doing so helped me understand what was going on.
2 - For those who haven't heard yet, this is the name for a technique where you tell the LLM to do something, and you set up a loop to repeatedly prompt it to keep working and doing stuff "until it's DONE".
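For anyone who wants to see the shape of that loop, here is a minimal sketch, not anyone's actual tooling. Everything in it is hypothetical: `run_until_done`, `make_fake_agent`, and the "DONE" marker are illustrative names, the "agent" is a stub standing in for a real LLM call, and the iteration cap is my own addition so the sketch can't spin forever.

```python
from typing import Callable


def run_until_done(
    agent_step: Callable[[str], str],
    task: str,
    max_iterations: int = 10,   # assumption: a safety cap, not part of the technique as described
    done_marker: str = "DONE",  # assumption: the agent signals completion with this string
) -> list[str]:
    """Re-prompt the agent with the same task until its reply contains done_marker."""
    transcript = []
    for _ in range(max_iterations):
        reply = agent_step(task)
        transcript.append(reply)
        if done_marker in reply:
            break
    return transcript


def make_fake_agent(steps_needed: int) -> Callable[[str], str]:
    """Stub standing in for an LLM call: claims to be done after steps_needed prompts."""
    calls = 0

    def fake_agent(prompt: str) -> str:
        nonlocal calls
        calls += 1
        if calls >= steps_needed:
            return "All checks pass. DONE"
        return f"Attempt {calls}: still working on '{prompt}'"

    return fake_agent


if __name__ == "__main__":
    for line in run_until_done(make_fake_agent(3), "fix the build"):
        print(line)
```

Note what the loop does not do: nothing in it inspects what the agent changed between iterations. It only checks whether the agent says it is finished, which is the worry above in miniature.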

We might need to come up with a catchy cartoon name for this strategy, otherwise it will lose the memetic war to bumbling Ralph Wiggums.