
Culture War Roundup for the week of May 8, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Yet More ChatGPT Stuff

Let me begin by saying up front that this message may read a bit oddly because I'm trying to keep some information out of it in hopes of keeping what remains of my fragile veneer of online anonymity.

Okay, background on me that is relevant here, and stabs my aforementioned fragile veneer of online anonymity in the back with a steak knife. I've just finished my 1L year at a top-50 law school in the United States. It was challenging, but not as challenging as a lot of law students like to say. Bitching and complaining are two of a law student's favorite things to do, but speaking as someone who spent a few years in the workforce before coming to law school I can say with no small degree of certainty I absolutely prefer law school to a 9-5.

But anyway. For anyone not familiar, law school, much like an undergraduate institution, runs on the semester system. My fall semester concluded in December of last year. Before finals season we all got an email from the Dean of Students promising fire and brimstone if we even dreamed about cheating on an exam. These warnings were in the usual fashion. No phones, no internet unless permitted (to some professors "open book open note" means your book, your notes, to others it means "Google? Sure why not."). The usual. Given the average educational level of the Motte I'm sure most of you received these emails or something almost identical twice a year for many years. My spring semester ended in April, and once again we were given the fire-and-brimstone email, this time with a twist. Among the most absolutely verboten things that we must never-ever-ever do was access ChatGPT during an exam. Now I have to admit, this was something of a surprise. I suppose it shouldn't have been, but a surprise nonetheless.

Then one of my professors said something interesting. He couldn't give us permission to access ChatGPT, but in his opinion the ban was absolutely useless for his class. He said he'd played around with it, and was completely convinced it could not pass one of his exams. Certainly not. I actually quite like this professor: he's an engaging speaker, clearly passionate, appreciates intelligent disagreement, and is just a very kind person. I genuinely think he went and tested it, and decided it couldn't pass. I don't think he was being a blowhard; that's just not the kind of person he is, at least in my judgment.

Now, the format of a law school exam, for those who are unfamiliar, generally follows the same basic model. You are given a fact pattern that varies in complexity from the fairly straightforward, to the reasonably realistic, to the completely outlandish. You are then asked to analyze it. Sometimes it's an extremely broad question like:

Identify all legal issues from this course that you can find in this fact pattern, and analyze them.

Which in the case of Torts is a notoriously painful proposition. Other times, you're given a slightly more narrow question like:

You have been hired as Mr. Smith's counsel. Evaluate the claims against him, potential defenses, and possible counter-claims.

Which I realize seems very similar, but when you have four pages of dense fact pattern and only 90 minutes in which to finish this section before you need to be moving on to the next, those seemingly minimal boundaries are very helpful. Then sometimes professors will give you a very narrow question:

You are the newest Assistant District Attorney for Metropolis. Your boss has asked you to evaluate the case against Mr. Smith with an eye toward filing murder charges. Ignore all other potential charges, a different ADA is working on them.

Generally speaking, what your professors are looking for is for you to "issue spot." They're not really interested in your legal analysis, though of course it needs to be at least credible, and they're almost never looking at your grammar or spelling. What they want is for you to show that you're capable of spotting what a lawyer should (in theory) be able to spot. What parts of these facts match up to the law we've studied in this class? Emphasis on the "in this class" part; I've heard horror stories about students evaluating potential civil liability in criminal law exams. Which I'm sure they did an excellent job of, but again: ninety minutes before you need to be moving on to the next section or you won't finish in time, and you're not getting any points for talking about tortious trespass when you should be talking about whether or not you can charge common law murder or manslaughter.

I'm getting to the ChatGPT stuff I promise.

Anyway, this professor gave us the background facts in advance. Why? Because there were more than twenty pages of them. Agonizingly detailed, dense, and utterly fascinating if you enjoyed the class like I did. Not the questions mind you, just the fact pattern. But given the fact pattern, you can generally get a sense of what the questions will look like. After all if your professor spent several pages talking about someone shooting someone else, you're probably not going to be asked to analyze the facts for potential burglary charges. So I read the facts, figured out roughly what my professor was going to ask, and then...

Went in and took the exam like a good noodle without trying to use ChatGPT.

What? I'm training to be a lawyer. We're supposed to be risk-averse.

But after the exam, well, things were different. I still had the fact pattern, I remembered roughly what the questions were, and it was no longer a violation of the honor code to use ChatGPT. I checked. Thoroughly. So I spent some time copying and pasting every word of that 20-page document into ChatGPT, and then asked it something fairly analogous to the first question on the exam.

It spat out junk. Made-up citations, misquotes, misunderstandings of the black letter law. But in that pile of garbage were a few nuggets of something that looked fairly similar (if you squinted and turned your head ninety degrees to the left) to what I'd written on the exam. Now, I'm not going to toot my own horn here. I'm no budding legal genius. I will never be on the Supreme Court, I probably won't be a judge, I doubt I'll make it on to law review. But I am confident that I am somewhat above median. Not far above median, but law school grades on a very strict curve. Professors are given an allotment of grades. Something like "you can give at most 5 As, 10 A minuses, and 15 B pluses, anything below that at your discretion." So if the top 15 scores on the final exam (which is 90-95% of your grade) were five 99s and ten 98s, then the 99s all get As, and the 98s all get A minuses. The poor bastard who only got a 97 gets a B plus. It is hard to achieve a high GPA in law school. Conversely, it is very hard to do worse than a B minus (predatory law schools excluded). Anyway, the point is that I know that according to my (above the median) first semester GPA, I am above the median. Not brilliant, but top half of the class.

So I started poking at it. I fed it the actual citations it was trying to make up based on my class notes and outline (read: study guide - no idea why but in law school study guides are called outlines), informed it of previous court rulings and the actual holdings that were relevant to the analysis, and then asked it the same question again.

Suddenly it was spitting out a good answer. Not a great answer, it was still way too short on analysis, but it correctly identified sticking points of law, jurisdictional issues, and even (correctly) raised a statute I hadn't fed it, which was a surprise. It must have been part of its training material. But the answer was still way too short. So I hit it with a stick and told it to try again and make it longer. Then I did that again, and again, and again. The hitting with a stick was really just telling it "write this again but longer, add more analysis, focus on the section about [whatever wasn't fleshed out enough]." Almost no effort on my part at all. Eventually I ended up with about five hundred words of actually pretty decent issue spotting and analysis.

Now, do I think that this was good enough to get an A? I doubt it. Good enough to get a median grade? A nice solid middle of the pack B? Yes. It could, I think, get a B, which is most definitely a passing grade.

The obvious caveats to all of this are manifold. I'm not actually a lawyer yet, so I have no idea how accurate my sense of what good legal analysis looks like really is. I also don't know how I did on my exam yet, so it's entirely possible I completely misunderstood an entire semester-long course that I sincerely enjoyed and am about to get the only C (you have to try to get lower than a C) in the whole section. I don't think that's likely (see supra "I am somewhat above median") but it is absolutely possible. This only keeps me up at night a little bit.

The further caveat is that law school is nothing like the practice of law, something that has been repeated ad nauseam by every lawyer I have ever met. So this is not me saying that ChatGPT is capable of performing as a lawyer. But it has taken the first step. Law school is supposed to teach you how to think like a lawyer, at least in theory. There's another theory that it's three years of extremely expensive hazing born out of nothing more than tradition, but let's assume for the moment that it actually does teach you how to think like a lawyer. ChatGPT is capable of, in a minimal sense, and with some poking and prodding, thinking like a lawyer.

Edit: apologies, I wrote this very early in the morning and forgot to include that I was using the free 3.5v, not 4.

PART 1/2

PART 2/2

Speak of the devil and he shall appear.

I was made aware of a paper, published yesterday by law school faculty, grading GPT-4 on their law school exams. While this paper didn't come from faculty at my law school, it did come from faculty at another T-50 law school. Their evaluation? Con Law C, Crim C-, Law & Econ C, Partnership Tax B, Property B-, Tax B. They noted that:

We found that it produced smoothly written answers that failed to spot many important issues, much like a bright student who had neither attended class often, nor thought deeply about the material.

As I mentioned in my previous post, it is hard to do very poorly on a law school exam. Scoring 21st out of 26 students in Income Tax netted it a B. Getting a mere 11 out of 20 points in the Property exam's multiple choice section put it in the 18th percentile, but the median score of actual law students was just 13.1. Of note, the authors do not state whether they have this policy, but in my (brief) experience most professors will permit students to explain their multiple choice answers on the exam. Getting the wrong answer, but for sound reasoning, will permit partial credit. Pointing out an uncertainty or lack of clarity in the question may entitle you (and others) to full credit. One of the other exams (Income Tax) first asked simply for the answer to a multiple choice question, and then later for it to elaborate on its reasoning. Elaborating on the reasoning caused an increase in score (8/17 to 11/17), which indicates either that GPT does better when it is forced to elaborate on its reasoning (unlikely) or that the graders take reasoning into account (more likely).

Also of interest is that very little prompt engineering occurred. While the authors of the study acknowledge that prompt engineering is an important part of getting the right answer, only minor tweaks or adjustments were made. If they, like students perhaps motivated to cheat on an exam, had taken the steps of trying to force out the right answer, I think it's likely that v4 could have spat out some much better answers, and gotten better grades.

I don't really have a full analysis of this paper yet, mostly because I'm still struggling to figure out exactly how much of it was motivated by an attempt to see how good this thing really was, and how much was motivated by an attempt to slap down those pesky peasants who keep saying that ChatGPT is going to replace lawyers, or at the very least render dramatic changes unto law school. The stated intent of the paper was to provide basic analytics and assist other professors in determining whether a student used ChatGPT to cheat on an exam. I'm not really convinced the second goal was achieved. Their stated observations were:

It often refers to doctrines by alternative names not used in class, as with “vested rights” rather than “nonconforming uses.” It sometimes spots entirely valid issues—on topics not actually covered in the course. GPT-4 produces unusually smooth and organized prose, often with helpful headers, numbering, and summaries. Hopefully these tendencies will help professors spot answers written by GPT-4 or similar models.

Many (most?) law schools use software called Exam4 for finals. This software can do things like lock down internet access, and it prevents copy/pasting from outside the software itself. Though, to my personal appreciation, it does allow copy/pasting inside the document, which has let me easily reformat some clumsy paragraphs in previous exams. It also provides spell-check. So given the limitations of Exam4, a student who is motivated to cheat doesn't even need to read this paper in order to beat most of these observations. Referring to a doctrine by an odd name? The student would likely correct that on their own. Topics outside the call of the question or the class itself? No gunner would waste the time to type those into the Exam4 software. Headers and numbering? Those aren't getting used by a pressed-for-time student. The only thing that remains is "smooth" prose, which is hardly an indicator of cheating, unless for whatever reason the student used ChatGPT on one section and not on the others, leading to a clear difference in writing style. But bear in mind again that the student will have to type each individual word that they want to use across to the Exam4 software, which would likely lead to transcription errors, rewording, and reformatting, thus muddying the waters.

Anyway I was pleased to see this come to light so quickly after my original post was made.

I already use chatGPT 4 in my work, in only a limited fashion so far. Sometimes I feed it text and ask it to revise it, or sometimes I treat it as a superior version of Wikipedia and ask it questions about DNA analysis (I know not to trust its answers at face value but it's invaluable as a starting foundation). When it comes to playing around with AI, I'm already way ahead of any of my colleagues, and I was flabbergasted to meet a few of them who were my age and had somehow never played around with chatGPT or its ilk.

There are a lot of tasks I expect to fully outsource to chatGPT. The ones I'm most thrilled about are using it to look up cases and synthesize caselaw from disparate scenarios, and using it to write briefs directly applicable to the fact scenario I give it. That alone will save me countless tedious hours. But I'm not at all worried about my entire job being replaced, and not because I'm deluded enough to think I'm irreplaceable.

There's a scene from the 1959 movie Anatomy of a Murder where they show the defense attorney perusing the shelves of a law library. Back in the day, if you wanted to look up cases, you had to crack open heavy tomes (called case law reporters) where individual decisions were catalogued. One of the perennially vexing issues with legal research in a Common Law system is to keep track of which cases are still considered "good law", as in whether or not they've been abrogated, overturned, reaffirmed, questioned, or distinguished by a later case opinion or a higher court. Back in the day, this was impossible to do on your own. If you found a case from 20 years ago, it was flatly not possible to read through every court case from every appellate level from the last 20 years to see if any of them had pruned the case you're interested in.

The solution was created by the salesman and non-lawyer Frank Shepard in 1873, when he started cataloguing every citation used by any given court case. These indexes would then be periodically reviewed, and Shepard would sell sticky perforated sheets that you could tear off and stick on top of the relevant case inside a reporter compilation. These lists would tell you at a glance where else the case was cited, and whether it was treated positively or negatively. The procedure back then, whenever you found a relevant case, was to then consult Shepard's index and ensure it was still "good law." Every legal database has this basic feature nowadays, but to this day the act of checking whether a case is still good is referred to as Shepardizing.

Consider also what transpired before "search" was a thing. Here too, legal publishers rushed to fill the gap and created their own indexes of topics known as "headnotes", typically prepared by lawyers who are experts in their respective fields. The indexes they created were sometimes nonsensically organized and often missed issues, but overall, if you wanted to find all cases that addressed, say, "damages from missed payments in the fishing industry," looking up headnotes was obviously much better than just sifting through a random tome.

Legal research has gotten way easier with searchable databases available to everyone, and job expectations have gone up in proportion. This tracks developments elsewhere. I don't know what explains the rapid rise of serial killers throughout the 70s and 80s, but the decline isn't that surprising: it's just so much harder to crime and get away with it nowadays. A murder investigation in the 1950s might get lucky with a fingerprint but would otherwise be heavily reliant on eyewitness testimony and alibi investigations (this is part of a long tradition and explains why trials and rules of evidence revolve so much around witness testimony). Now, a relatively simple case generates a fuckton of discovery for me to sift through: dozens of cameras, hundreds of hours of footage, tons of photographs, a laser scan of the entire scene, the contents of entire cell phones, audio recordings of the computer-aided dispatch for the previous 12 hours, and on and on.

All of this can fit nicely on my laptop and though I can ask for help, I'm generally expected to have the tools to pursue this case on my own. After all, I don't rely on a secretary to type up the briefs I dictate nor would I need a paralegal to organize hundreds of VHS tapes. The advancement that seems obvious to me is that our workload expectations will just go up, with the accurate understanding that modern tools make it easier to handle more.

@Supah_Schmendrick referenced a comment of mine on how averse courtrooms are to technology. It's true that there is an aversion to technology, and I'm already encountering some panic among local public defense leadership wanting to completely ban chatGPT. I've had to patiently explain to them that this is a reflexive overreaction, completely unenforceable, and also likely to be moot as big tech continues to jump on the bandwagon with products like Microsoft Copilot. I don't think that aversion will last long though, because the benefits are so blatant here and way too valuable to pass up, and part of the argument I made to local leadership is that prosecutors and law enforcement are definitely already using LLMs to assist with tediousness. Supah_Schmendrick's point about interpersonal relationships is also worthwhile, and I would add that an identifiable individual ordained to be a legal expert is useful as a measure of accountability. The ability to say "I consulted with a lawyer" will continue to have weight in ways that "I asked chatGPT" won't.

[tagging @self_made_human also]

That's an interesting anecdote, I think lawyers are almost uniquely positioned to exploit ChatGPT (the blade cuts both ways, the more of your work ChatGPT can do, the easier you are to replace).

You have the combination of enormous amounts of text to peruse, and the consideration of subtle details and intricacies that are made easier with superhuman attention to detail and patience (or an Adderall prescription): practically a playground for a Large Language Model.

Now, I have a mildly jaundiced view of Law as a profession, because IMHO, the fact that a dedicated caste of professionals is needed to simply understand the legal code, let alone the interactions and ramifications therein, seems like a failure of the same. Nothing against individual lawyers though, I recognize the profession is necessary, since it pops up time and again in grossly different nations and time frames.

I expect human lawyers to end up as thin wrappers for GPT-5 sooner rather than later, with the most entrenched and experienced lawyers capable of leveraging relationships and prestige in a manner that a humble bot simply can't.

Now, replacing judges with LLMs would be the real killer deal, especially if the model's thought process were clear enough and widely shared that you could assess the outcome of a trial before it even went to court. Not that that's going to happen anytime soon, but it would certainly deal with the biggest bottleneck in legal systems.

Perhaps a more feasible intermediate goal would be LLMs as screening tools, with human judges opting to either rubber stamp their decision, or only escalate if they felt it was unsatisfactory.

Now, I have a mildly jaundiced view of Law as a profession, because IMHO, the fact that a dedicated caste of professionals is needed to simply understand the legal code, let alone the interactions and ramifications therein, seems like a failure of the same. Nothing against individual lawyers though, I recognize the profession is necessary, since it pops up time and again in grossly different nations and time frames.

I agree, and my hope is that LLMs make legal issues dramatically more accessible. The legal code is currently written by lawyers for other lawyers, but normal people are expected to know and abide by it. I already plug statutes into chatGPT and ask it to explain them to me because I can't be bothered to machete-chop through the dense legalese. I wonder what equilibrium we'd settle into: would law become more understandable thanks to LLMs' ability to explain it, or would it become even more complicated thanks to LLMs' ability to generate it?

Now, replacing judges with LLMs would be the real killer deal, especially if the model's thought process were clear enough and widely shared that you could assess the outcome of a trial before it even went to court. Not that that's going to happen anytime soon, but it would certainly deal with the biggest bottleneck in legal systems.

I would basically guarantee that judges and their newly-graduated clerks are already using chatGPT to cut down on their workload, but they're going to keep quiet about it.

So my question:

Given the average educational level of the Motte I'm sure most of you received these emails or something almost identical twice a year for many years. My spring semester ended in April, and once again we were given the fire-and-brimstone email, this time with a twist. Among the most absolutely verboten things that we must never-ever-ever do was access ChatGPT during an exam. Now I have to admit, this was something of a surprise. I suppose it shouldn't have been, but a surprise nonetheless.

Given how competitive law students are (particularly at high ranked schools) and how aggressive the filtering at such schools is, and how many underhanded tricks many students already employ to gain any and every advantage they can, even potentially risky strategies...

Do you think that any Law Student who isn't using ChatGPT to assist their performance is getting screwed? That is, despite all these warnings, would it make any sense for the class Gunners to ignore this tantalizingly effective tool that could give them the edge they need to beat the curve and graduate in that coveted top 10% of the class?

Like, the pure Molochian incentives, the same ones that drive students to shell out thousands of dollars on supplemental material and tutors, and pop Adderall like Skittles, SURELY make it all but inevitable that they will find ways to use GPT4 to squeak out an edge over their peers, until it just becomes an expected part of the experience.

This is a tricky one to answer just because my personal law school experience has dramatically conflicted with the more widely reported experience. The one you hear about constantly over on /r/lawschool or wherever goes something like "I study 30 hours a day but gunners study 34 hours a day and steal casebooks from the library and give out bad outlines on purpose and one of them literally killed a rival's puppy right before an exam just to get an edge." Exaggeration of course, but not by much. In contrast I've found the students where I am are nothing but kind, helpful, and cooperative as a rule. My classmates share outlines, form large and happy study groups, and generally are just nice people. Something like fifty of us all went out for drinks after our last exam and it was a very congenial atmosphere. Even the people I dislike on a personal level (like the ABA-issued class communist) I don't suspect of any malfeasance.

On the other hand, there are enough stories about gunners like that going around that there must be some kernel of truth to them. Where there's smoke there's fire, and all that.

But more to your point, the exact wording you used tickled my brain a bit. The part that goes:

Do you think that any Law Student who isn't using ChatGPT to assist their performance is getting screwed? That is, despite all these warnings, would it make any sense for the class Gunners to ignore this tantalizingly effective tool that could give them the edge they need to beat the curve and graduate in that coveted top 10% of the class?

This is, almost word for word, the exact same thing I've heard about people who get extra time on exams for whatever bullshit they got a doctor to sign off on. ADHD is ever popular, since it usually comes with a prescription for the Adderall you mentioned. But of the fifty-odd people in my section, I can count on one hand the number who have gotten extra time. Did some of them abuse the process to get it? Maybe. Have other students done so at other law schools or in other sections? Of course. But it's still not expected. I think ChatGPT will likely end up being the same sort of thing. Unscrupulous individuals will of course use it to try and get that edge. But the vast majority of law students won't, for a variety of personal reasons ranging from the extremely practical "too likely to get caught" to the extremely virtuous "it's wrong." At the stage it's at right now, I don't think it provides that competitive edge, not really. When v5 rolls out, or when a special legal AI is released to the public, it probably will. So to answer your question: for now, no, I don't think it makes sense. In another few months that answer may very well change.

My classmates share outlines, form large and happy study groups, and generally are just nice people.

Not that this is necessarily going to change, but what tends to happen after 1L year, once the assortative effect of final grades kicks in and people start angling for Law Review positions and realize that they now have to really plan for how they're going to land after law school, is that you see the real strivers come out, and anyone who smells a chance at getting Summa Cum Laude will start to pull out all the stops.

Because when all grades are being curved, someone's gotta fall to the back.

The 'nice' thing about GPT is that it doesn't assist them in directly undermining others, but it just seems utterly inevitable to me that certain students will grab hold of any advantage they can.

Indeed, like many legal tools, familiarity with it might make someone a better lawyer overall. On the flip side, of course, if they used it as a crutch to get through law school at all (I'm talking someone who would probably have failed out if not for having access to GPT), one wonders if they have any future in the field given that GPT can, presumably, replace them easily.

The bellwether is education. You can ask it clarifying questions in plain English and it will give you a clear, context-specific explanation. If 3 years from now we are still keeping kids in 20-30 person classrooms with teachers teaching lesson plans on the whiteboard, then bullshit jobs are here to stay.

The thing that really bears consideration, though, is that the TECH that GPT represents can absolutely be honed and modified to excel at certain tasks, and for law in particular the training corpus is extensive and easily available.

And so you get actual specialized models that DO give good citations, having been trained on the caselaw, and are trained to format it correctly.

See:

https://casetext.com/

Demo here:

https://youtube.com/watch?v=ZKLkmdK_Odw

Note that it lets you upload your own documents for analysis, which is surely a killer application since it would save you the trouble of having to copy/paste all the facts and the exam questions in manually.

So at a bare minimum, we see what these things are capable of in principle.

So my guess is that GPT5 is able to do everything that CoCounsel does, and at a higher level, and at that point it WILL be scoring near the very top on virtually any law school exam you could throw at it.

It gives me an aneurysm when people simply claim to be talking about ChatGPT without specifying if it's the GPT-3.5 or 4 model.

That is an important distinction; the models are akin to a smart and well-read high schooler versus a decent grad student in terms of intelligence and breadth.

4 is grossly superior, and is less prone to hallucination and bullshit, though not immune. On difficult questions, the difference between them is hard to overstate.

Further, I suspect that if you're copy-pasting 20 pages of legal text into it, you're also exceeding the context window for either model, measured in tokens, with each token being roughly 3/4 of a word. There's a GPT-4 32K model that can handle 20 pages no problem, but I strongly doubt you somehow ended up using it, since it's gated behind API access. IIRC, 3.5 has a max 4k-token context window, and the version of 4 used in ChatGPT Plus is the same.

The issue with going over the context window is that the model simply becomes blind to whatever came before. An input of A-Z is seen by a model as "Y, Z" if it has a token window of 2.
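As a rough illustration, here's a minimal Python sketch of checking whether a document even fits in the window, assuming OpenAI's open-source tiktoken tokenizer; the 500-words-per-page estimate is my own illustrative assumption, not something from the original post:

```python
# Minimal sketch: count tokens to see whether a document fits a 4k
# context window. Assumes the tiktoken library is installed; the
# words-per-page math below is illustrative only.
import tiktoken

CONTEXT_WINDOW = 4096  # GPT-3.5 (and ChatGPT Plus GPT-4) limit, per above

def fits_in_context(text: str, limit: int = CONTEXT_WINDOW) -> bool:
    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
    return len(enc.encode(text)) <= limit

# A 20-page fact pattern at ~500 words/page is ~10,000 words, which at
# ~4/3 tokens per word is roughly 13,000 tokens: far past a 4k window,
# so the model would be "blind" to all but the tail end of the dump.
```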

So, presuming that you're using the free ChatGPT 3.5, you're behind the curve; if you think it's already scarily capable, you've seen nothing yet.

IMO, lawyers are fucked. Clearly, indubitably fucked beyond redemption, you're better off joining the Writer's Guild in pre-emptively banning LLMs in law instead of trying to compete. You're not going to win that battle, and it'll be an utter rout by the time you're out of law school.

IMO, lawyers are fucked. Clearly, indubitably fucked beyond redemption, you're better off joining the Writer's Guild

A couple contrary points:

  1. at least in the U.S., the practice of law is an upper class guild. Anything that would threaten the prosperity of the guild too much will be reacted against in the same way the medieval guilds reacted against competition - i.e. harshly.

  2. lawyers are disproportionately overrepresented in politics, and existing law gives the profession the right to govern itself and set its own standards. So regulatory or legally-mandated self-licking ice-cream cones which are well behind the technological curve are quite possible.

  3. the portion of the bar which actually wields power (judges, politicians, senior lawfirm partners, etc.) are disproportionately old, small-c-conservative about process and procedure, and notoriously tech-averse. I can't find it right now but I recall a brief back-and-forth I had with @ymeskhout about the absolutely pitiful state of courtroom computerized document- and record-review technology. The long and short of it is, I would not expect the top echelons of the legal profession to be best placed to make good use of the tools AI offers.

  4. not all portions of the legal field would be replaced well by AI. Contract drafting? Perhaps. Brief-writing? Also perhaps. But lawyers are also depended upon to manage interpersonal relationships (e.g. with regulatory agencies), negotiate, and advise clients about imaginative strategies and as to potential long-term consequences of various actions. While AI may make lawyers better at these tasks, I don't think LLMs as we currently see them can fully replace people. Also, lawyers are called upon for record-keeping and compliance matters - as AI is shortly going to drastically reduce the cost of document generation, there's going to be a LOT more for lawyers to keep track of (assisted, of course, by other AIs).

Just my two cents - I'm also not that smart, so I could well be very wrong. But I don't have it in me to resign myself to the glue-factory just yet.

You won't find me disagreeing about the fact that it's professions with well-cartelized guilds that can close ranks to protect their own that have a short-term advantage against automation. Having literal politicians feeling favorably towards you can't hurt either.

It's for similar reasons that I think that US doctors have a better shot of doing the same than their meeker UK and Indian counterparts. After all, they already maintain their very high salaries by keeping us poor bastards away unless we jump through a great number of arbitrary hoops. Can't blame them though; if that's the cost of making $250k median, I'd do it too.

More importantly, AI is absolutely a field in which you can't afford to simply plan ahead after looking at the current state of the field; GPT-5 and above will likely be just as good at any cognitive task you throw at them, including planning and selling clients on speculative ideas.

At any rate, even if the top 1% of the profession clings on a little longer, it's no consolation to a poor bastard still in law school; he's never going to make it to those rarefied ranks before they pull the ladder up at escape velocity!

I would think there would be a decrease in cost for legal services, since everything could be done more efficiently on the part of those who are lawyers, and so the supply curve would shift. At the same time, I'm not sure if this will result in a wage decrease on average, because the lawyers that exist should be able to do more, which might compensate.

I think the decrease in cost would increase the level of litigation that goes on, which would increase demands for lawyers in courts and for judges, which cannot be as easily automated. I'm not sure what the net effect of demand for lawyers would be.

I'm sure people with more experience in economics would have a more detailed perspective.

I think the real issue will be severe downward pressure on wages, particularly among newly minted attorneys who are looking for jobs in-house or large firms.

That is, for most intents and purposes, GPT can do just about anything a first-year associate can do, and cheaper. Your 'bargaining position' as a new attorney is basically "if you spend tens of thousands of dollars and hundreds of hours training me I MIGHT be able to produce work on par with this $500/month software subscription."

So law firms are likely to cut the budgets they would otherwise use to hire rafts of recently-graduated associates at extremely high salaries.

This will place a lot of pressure on law schools to bring their prices down, and it isn't clear they can do that without disrupting a lot of other aspects of academics. That is, many students go to undergrad SOLELY for the purpose of getting into law school later. Law school is an 'artificially' capped field, and there are many other expenses (books, tutoring, test prep... Adderall) that students will spend money on to get ahead.

Harder and harder to justify that if they can't expect to make at least middle class wages right out of school.

Many mid-late career attorneys will be able to get by on the strength of their reputation alone. Even if LawGPT is technically just as smart as they are, and just as if not more likely to return a 'correct' answer, a guy with 20 or so years of experience under his belt will command a certain gravitas and thus can render advice, opinions, and judgment that is accepted as authoritative on his say-so.

He probably won't feel the pinch of competition so much. Although I'm fully prepared to be wrong about that if the superiority of the AI is just that dominant.

lawyers are fucked

I'm not a lawyer and I have little understanding of what they actually spend most of their time doing, but I did serve on the jury for a criminal trial one time.

One of the defendant's attorneys was a smokin' hot young blonde. Plausibly, she biased some of the jurors at least a little bit in favor of her client, simply through her presence. Someone who rolls up with nothing but a laptop and ChatGPT is going to be at a relative disadvantage, if your goal is to persuade other humans. Even if you can generate a convincing deepfake, that's always going to lose out to the real thing who's actually in the room with you.

I'm aware that many lawyers will go their whole careers without ever setting foot in a courtroom, and not all lawyers can be blessed with the advantages of being young, attractive, and female. But this is just one example of how there can be factors in a job that can't be reduced to raw text input-output.

There's also the issue of liability. Perhaps this isn't quite as important for law as it is for medicine, since law is forced to be much more tolerant of risk and failure than medicine is, but I think this will still be an issue. If I ask ChatGPT to write the contract for my multi-billion dollar merger and the contract has a major loophole or error in it, who do I blame? OpenAI? Will OpenAI just tell me "sorry, all sales are final, use at your own risk"? I'm not sure if that's going to fly - people will still be reviewing the contracts if only because there needs to be someone to blame if things go wrong.

It gives me an aneurysm when people simply claim to be talking about ChatGPT without specifying if it's the GPT-3.5 or 4 model.

Yup, my bad, I was using the free version of 3.5, so I'm sure you're correct that the 4 model will be vastly better.

Further, I suspect that if you're copy-pasting 20 pages of legal text into it, you're also exceeding the context window for either model, measured in tokens, with each token being roughly 3/4 of a word. There's a GPT-4 32K model that can handle 20 pages no problem, but I strongly doubt you somehow ended up using it, since it's gated behind API access. IIRC, 3.5 has a max 4k-token context window, and the version of 4 used in ChatGPT Plus is the same.

Possible, but I think I was under the context window as it had no trouble applying stuff from the early sections of the dump in context with the new information I fed it.

IMO, lawyers are fucked. Clearly, indubitably fucked beyond redemption, you're better off joining the Writer's Guild in pre-emptively banning LLMs in law instead of trying to compete. You're not going to win that battle, and it'll be an utter rout by the time you're out of law school.

Depends on what law is being practiced. I'm personally uninterested in the traditional law firm route, there is nothing that sounds more utterly boring to me than spending (to quote an excellent song) four years working on a pharmaceutical company's merger with another pharmaceutical company. Cue hysterical laughter from the attorneys reading this who all know at least a dozen "I'm gonna work in public interest!" law students who took a job at a BigLaw firm. I've secured an internship this summer working in a public interest field, and if I do well there's a fairly good chance of my receiving a job offer on graduation conditional on passing the bar.

While a lot of legal work can, and will, be automated without requiring input from anyone other than the attorney submitting the documents, the biggest losers are going to be paralegals and secretaries. The legal field is incredibly conservative in a lot of ways. Hell, my state's court system didn't even have a mandatory e-filing system until 2014, and it still isn't technically fully rolled out. State bar associations are slow to change things, and I would not be surprised in the slightest if legal-AI ends up in the middle of a massive court battle that will drag on for literal years. So long term, maybe. Short-medium term? I doubt it.

For the record, my boss played around with 4.0 and over-optimistically reported on Monday that it had come up with three amazing cases with holdings that were perfect for a very tricky matter we're working on. When I and another associate actually checked the cites, two of the cases were complete fabrications, and the third cite was real but the AI had completely fabricated the facts and holding.

I tried playing around with 4.0 today, and while I could basically get it to spew generally-correct answers regarding broad questions (e.g. "can law enforcement use under-age decoys in sting operations at licensed alcohol retailers?"), it does horribly when asked to summarize specific regulations or cases, even when given the correct citations. Casetext is better, but we're not quite there yet.

I've talked about this briefly as GPT's reliability problem. If I have to keep massaging GPT to correct the bullshit I know about, then it isn't doing anything to help with the bullshit I don't know about, and I might as well write the damn thing myself, because at the very least I won't make up the law in the absence of any on-point citations. I'm also curious whether GPT is able to draw parallels among cases that aren't directly related. A lot of the time an issue is fairly novel and there's nothing directly on-point, so you have to go to whatever is vaguely similar and argue from that; I haven't seen any evidence thus far that GPT is capable of making those connections, even though they're crucial in appellate practice. If a lawyer tries to use GPT to save time and the first result is garbage like you got, the firm isn't going to train their lawyers to massage the inputs until they get something coherent; it's going to tell them all the problems with it and bar its use. At some point there will be a malpractice suit where the lawyer is forced to pay an award because he relied on GPT and fucked up. And once that happens it will take a very long time before anyone in the legal profession trusts it, regardless of how good it is. Hell, a firm I was at didn't want us to use OCR because it had a history of missing stuff that wasn't printed perfectly.

Firstly, can OP clarify if he's talking about 4 or 3.5? Secondly, look at what people are already doing in the field of law and earning discounts: https://twitter.com/jbrowder1/status/1652387444904583169

GPT-4 is, in my view, Generally Intelligent. I can ask it to submit a corporatese style job application for the hordes of Genghis Khan, I can have it dream up sequels to video games, it can manage moderately complex programming tasks and bugfix them, it can emulate the format of Grand Designs, write half-decent parts from the Simpsons... Aside from censorship and limitations with how much memory it can store in a conversation, is this not general intelligence?

Certainly there's confusion as to the definition of AGI. GPT-4 doesn't meet the qualification 'perform most human functions as good as or better than humans' because it can't draw or use a mouse, amongst other things. But in terms of matching human intelligence, it is basically there.

Human intelligence is not all it's cracked up to be. Consider that about half of the UK Parliament can't answer a basic probability question. GPT-4 has no problem with this, nor does GPT-3.5.

The survey asked the MPs what the probability is of getting two heads if you flip a fair coin twice.

Only 52% of those surveyed gave the correct answer of 25%. A third (33%) said the answer was 50%, while 10% didn’t know. The rest gave other answers.
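(For the avoidance of doubt: the two flips are independent, so P(two heads) = 1/2 × 1/2 = 1/4, i.e. 25%.)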

Firstly, can OP clarify if he's talking about 4 or 3.5?

Apologies, I wrote this fairly early in the morning and forgot to include this information: I was using the free 3.5v.

GPT-4 can draw (albeit not well) if asked to output SVG or TikZ or some other human-readable graphics format.

I've tried a set of qualitative math/engineering questions on LLMs. The Bard and GPT-3.5 answers were about what I'd expect from an undergraduate starting to study the field: roughly 25% of the answers were true-and-useful, 50% true-but-not-useful (not a bad thing, just statements that were adding context to the answer rather than being the answer), 25% not-true (though even these were oversimplifications and natural misunderstandings, not utter hallucinations). If a human had given those answers I'd have considered them a kid worth mentoring but I wouldn't have expected them to save me any work in the near future.

The GPT-4 answers were better than I'd expect from a typical grad student who had just passed an intro class on the subject. Adequate depth, more breadth, and this time the statements weren't 25/50/25, they were about 75/25/0. I passed my questions to a friend's brother who had a subscription, but now I'm tempted to subscribe myself, give the thing my whole final exam, and see how it does on the quantitative + symbolic questions.

GPT-4 gets 88th percentile on the LSAT vs 40th for GPT-3.5.

If you are using the free version, it's the equivalent of a D student; pay $20 to get an A student.

Your professor used to be right when he said the cutting-edge AI model wasn't that good for law, but AI moves so fast that he is now wrong.

OP is talking about a law school exam, not the LSAT, which is very, very different.

They are both used in the real world to evaluate legal ability (or potential), so such a drastic gain in ability between versions is worth knowing. I think it's almost certain he'd see improvement in his task using the latest version.

A law school exam is also very very different from the bar exam, just FYI.

Depends which law school you're going to, and what class you're taking. Bar-subject classes at middle-ranked law schools tend to hew pretty closely to bar-exam-style questions because they're not trying to train the next SCOTUS nominee; they're trying to make sure their bar-passage rate is high enough to attract more applicants.

You really need a bit of familiarity and prompt wrangling skill to get the most out of GPT. I've seen plenty of people testing it and fucking up basic stuff which makes them conclude that it's overhyped.

A common issue is not pressing the "GPT-4" button in the ChatGPT interface, and getting GPT-3.5 when you think you're running 4; and not being familiar enough with the interface to recognize that the blue-green icon means it's a lot dumber. E.g. Bryan Caplan fell for that.

A second issue is that GPT is at its core a predictor, not a simulator. If you ask it something and it messes up the response, asking it something else in the same chat context means it will now start predicting that it's a character that makes mistakes. Resetting the chat fixes that. In the same vein, if you are giving it a quiz, having it predict a student response is suboptimal. Students make mistakes; you want it to predict the answer key.

Also, giving it few-shot examples improves it a great deal. If you have an example of a perfect query+response to a similar question, stuff it in the prompt itself! Then the LLM will know what kind of response you are looking for.
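To make that concrete, here's a minimal sketch of a few-shot prompt using the (2023-era) openai Python library; the example question and answer strings are placeholders of my own invention, not real exam content:

```python
# Minimal few-shot sketch with the 2023-era openai ChatCompletion API.
# The example Q&A pair is a placeholder; in practice you'd stuff a real
# query plus a known-perfect response into the prompt.
import openai  # reads OPENAI_API_KEY from the environment

EXAMPLE_Q = "Fact pattern: ... Identify the tort issues."         # placeholder
EXAMPLE_A = "Issue 1: negligence. A duty arises because ..."      # known-good answer
REAL_Q = "Fact pattern: ... Evaluate the claims against Smith."   # placeholder

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        # Prime it to predict the answer key, not a fallible student.
        {"role": "system", "content": "You write model answers for a law school exam."},
        # One-shot example stuffed into the prompt itself.
        {"role": "user", "content": EXAMPLE_Q},
        {"role": "assistant", "content": EXAMPLE_A},
        # The real question, in a fresh context with no prior mistakes.
        {"role": "user", "content": REAL_Q},
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```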

A common issue is not pressing the "GPT-4" button in the ChatGPT interface, and getting GPT-3.5 when you think you're running 4; and not being familiar enough with the interface to recognize that the blue-green icon means it's a lot dumber. E.g. Bryan Caplan fell for that.

Which blue-green icon are you referring to?

In the corner of the ChatGPT replies there's a little profile picture; if it's a blueish-green it's running a 3.X model; black means it's 4.

You might need to click on the image on the Caplan Twitter post to view the entire thing, it gets cut off in the preview window for me.

As a programmer, I think of ChatGPT as a higher-level language. In the 1950s, programmers wrote ones and zeros. By 1970, much of that was done by the computer, and the programmer just had to write "save this to memory" and it was done. By the 2000s, a programmer just needed to write "fetch this thing from this database on a server" and the computer did it.
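To illustrate that ladder with a toy sketch (the database and table names here are made up for the example): what once took hand-written machine code and manual memory management is now a single declarative line.

```python
# Toy illustration of the abstraction ladder: the programmer states the
# intent ("fetch this thing from this database") and the compilers,
# drivers, and OS underneath handle registers, memory, and disk.
# Assumes a firm.db file with a contracts table already exists.
import sqlite3

conn = sqlite3.connect("firm.db")
rows = conn.execute(
    "SELECT clause_text FROM contracts WHERE client = ?", ("Smith",)
).fetchall()
```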

Natural language is the low-level language of law. It is the language in which the details are written. AI will allow lawyers to write in a high-level language. A 10-sentence description will be able to yield a 10-page contract. A modern programmer is in many ways akin to a tech lead 30-40 years ago, managing digital tools that do the actual work. A future lawyer will be closer to a senior lawyer managing a group of digital juniors who fight with the wording of projects.

As this transition comes about, it in many ways becomes harder for juniors as a bigger part of the job gets shifted toward management, understanding requirements and leading a project. A person with business experience and mediocre programming skills may very well be a better web dev consultant than a code challenge pro.

On the flip side, the price has gone down and that increases demand. Most likely, people will use lawyers in cases that are currently too expensive. People will want custom contracts for smaller deals, poor people will want wills, etc. I don't think the legal profession will die; if anything it will explode as output rises. If a job that today takes 10 hours can be done in one, the price will be 10% of what it was. That could very well increase demand twentyfold.

Working with consistency, automated testing, finding the edge cases that the AI has trouble with and gathering requirements from customers will be a bigger part of the job.

"Bugs" in legal contracts can be pretty high stakes though -- I'm not sure lawyers will want to make the shift to signing off on AI generated contracts without extensive review. (which AIUI is kind of like reading legacy code in that it's actually more work than writing the damn thing yourself in many cases)

On the flip side, I think "GPT-as-a-linter-for-contracts" might be a killer application. Take a set of "things that are frequently bad in contracts"; for each item in the list and each clause of the contract, ask if that clause has that particular issue.

If a contract passes the linting step, that doesn't necessarily mean it's fine, but if it fails the linting step, that's at the very least a sign that a human should be looking at that particular part more closely.
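A hedged sketch of what that linter could look like, again using the 2023-era openai library; the ISSUES list and prompt wording are illustrative assumptions, not a vetted checklist:

```python
# Sketch of "GPT-as-a-linter-for-contracts": for each (clause, issue)
# pair, ask a yes/no question and flag hits for human review.
# The ISSUES list and prompt are illustrative, not legal advice.
import openai  # reads OPENAI_API_KEY from the environment

ISSUES = [
    "an ambiguous payment deadline",
    "a missing governing-law clause",
    "an unlimited indemnification obligation",
]

def lint_contract(clauses: list[str]) -> list[tuple[int, str]]:
    flags = []
    for i, clause in enumerate(clauses):
        for issue in ISSUES:
            resp = openai.ChatCompletion.create(
                model="gpt-4",
                messages=[{
                    "role": "user",
                    "content": (
                        "Answer YES or NO only. Does this contract clause "
                        f"contain {issue}?\n\n{clause}"
                    ),
                }],
                temperature=0,
            )
            if resp.choices[0].message.content.strip().upper().startswith("YES"):
                flags.append((i, issue))  # failed the lint: a human should look
    return flags
```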

"Nowadays, I do most of my programming using an extremely-high-level language known as 'Undergrad'"

  • Some computer science professor, when asked about how he creates programs.

That programming language sounds like a nightmare to program in.