ControlsFreak
No bio...
User ID: 1422
Level, yes. You can actually do an okay-ish job by just dialing down some numbers. The great thing about the Elo system is that you can just use results to figure out how to 'rate' a bot. Just create a gimped bot, let it play a bunch of people online, see what rating range it ends up in, and poof, you've got a bot that plays "at that level". Want it a bit higher/lower? Turn it up/down a smidge. You'd need a lot of top-level players to play along with the scheme to dial it in decently at the elite level, though. Your estimate of its level gets better as it plays more, so this is likely practical for a site like chess.com for very short games. It's harder for elite-level and longer games, just because it's going to be hard to get enough data.
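To make the "use results to rate a bot" idea concrete, here is a minimal Python sketch of the standard Elo expected-score formula, the usual per-game update, and a performance-rating estimate found by bisection. The function names, the K-factor, the search bounds, and the sample games are all assumptions for illustration; this is not how chess.com or any other site is known to calibrate its bots.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score (between 0 and 1) against one opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_rating(rating, opponent_rating, score, k=20.0):
    """One Elo update. score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k=20 is an assumed K-factor; real sites pick their own."""
    return rating + k * (score - expected_score(rating, opponent_rating))


def performance_rating(games, lo=0.0, hi=4000.0, tol=0.01):
    """Find the rating whose total expected score matches the actual total.

    games is a list of (opponent_rating, score) pairs. The total expected
    score rises with rating, so bisection works. If the bot won (or lost)
    every game, the answer just runs to the search bound, which is one
    reason you need a decent number of mixed results.
    """
    total = sum(score for _, score in games)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp, _ in games)
        if expected < total:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical results for a dialed-down bot: (opponent rating, score).
sample_games = [(1400, 1.0), (1550, 0.5), (1600, 0.0), (1500, 1.0), (1650, 0.0)]
print(round(performance_rating(sample_games)))
```

With only a handful of games the estimate moves around a lot from one result to the next, which is the practical reason this is easier to dial in for short games with lots of opponents than for elite-level classical play.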
Style? Not a chance. At least not right now. This is an area where a lot of folks are investing significant efforts. The hope is that rather than just using traditional engines, you can take gigantic databases of human moves, sprinkle in some ML magic, and get something that plays more like humans. My sense is that no one has been really successful in doing so yet. I haven't even heard any rumors of any top-level players finding someone who has managed to do this and then proceeding to use it. Maybe someone has, and they're keeping it top secret for a competitive edge, but if so, it's very secret, and I haven't heard a peep.
There is a huge debate about playing online or against engines, and this could certainly come up in a conversation about the controversy. This requires another digression into some background on chess.
One major perennial discussion in the chess world is about time controls. That is, how much time do players have to think about their moves? There are a lot of different time controls out there that people use, and I'm certainly not going to cover the entire debate here. The most prestigious chess events are still 'classical' time control. This already gets confusing, because there's not just one single time control for classical, but there is somewhat of a range of possibilities that people generally view as 'classical'. For example, right now, both the open/women's candidates tournaments are happening. The open is using a time control of two hours with no increment for the first 40 moves, then an additional 30 minutes plus a 30 second increment added to your time for every move you made thereafter. The women's section is instead using 90 minutes plus a 30 second increment from the beginning through move 40, then an additional 30 minutes (still with increment) thereafter. But both of these time controls are still generally considered 'classical'.
Time controls are enough of a controversy that they are often cited as one of the issues that has driven a wedge between Magnus Carlsen (viewed by many as still the best player in the world) and FIDE, with Magnus giving up his title of classical world champion, participating in fewer FIDE events, and possibly(?) building groundwork for a competing organization to challenge FIDE. But that's an aside. All to say, they're a big deal.
Second is online vs. 'over-the-board'. Online chess has enabled a lot of people to play casually or even competitively. However, the online community is generally very skewed toward much faster time controls. I haven't checked the stats, but my sense is that online, blitz (approx 3-5min) is the most popular, followed by rapid (approx 10-25min) and bullet (approx 1min) in some order. There just aren't that many good players playing classical online (or honestly, that many players in general). There is huge controversy as to what extent playing much faster chess translates to success in slower chess. You can try out more ideas more quickly; you can train your instinctual or short-calculation skills; it may genuinely improve your play if you get into a time crunch in a classical game. But you don't get the experience of really sinking into a position and calculating deeply. It has not been uncommon in the current candidates tournament for players to invest 30-60 minutes on a single move that they think is critical. Recognizing when to use that sort of time and using it effectively is a skill that you simply cannot build playing blitz. But we've also seen some players get really really good at blitz first, and then eventually transition into playing quite good slow chess, so the relationship between the two is still a pretty open and controversial question.
Moreover, there is a perception that cheaters (using engines) are much more prevalent online than at over-the-board events. It's already hard to find many people playing classical online; but if a bunch of them just sort of give up and start using engines anyway, are you really getting much that you wouldn't be getting by playing an engine? In fact, some people think that it's even worse if you're worried that your opponent is cheating. It's harder to stay focused; you're more likely to waste clock cycles thinking about whether they're using an engine or not rather than on the game.
That, then, brings us to engines. Engines are super super good. Much better than any human. Sometimes there are ways that you can play anti-computer chess, but even that is pretty hard. And they do not "think like humans". Top players certainly use engines to help come up with opening ideas, or they'll use them to help them improve some of their calculations, but there's somewhat of a fine line between looking at something the engine says and thinking, "Oh, that makes sense; I can maybe try to incorporate that idea into my thinking in some way in the future," and, "No dawg; even if you gave me an additional hour in a game, I am either not going to come up with that idea or not going to be able to calculate enough of it accurately to ever feel comfortable trying such a thing." Engines can help, but it is hotly contested concerning how they can help, what level players can get what kind of help from them, etc. The current classical world champion famously was not allowed to use engines at all for the first however many years of his development (I don't remember the exact number).
How to effectively use engines to prepare for human tournaments is difficult. Aside from using them sometimes for tactics or other ideas, they're probably most used in "prep", where a person makes some plans ahead of time for what they want to do in the opening phases of the game. This is notoriously difficult, even at the highest level. In this very candidates tournament, one of the most well-known players (because he's also a streamer) got into a situation where he had played his computer-driven prep, and at one point, his opponent played a move that wasn't in his prep. His next decision was a critical one, but the position was quite complicated. He spent like an hour and then played the wrong move. After the game he blamed himself and his team for not looking at that move in preparation. He thought that the position was "impossible to play" as a human, and this is one of those pitfalls that make working with engines hard. You can't download everything from the engine into your brain; you have to stop somewhere. Where do you stop? You have to be a highly skilled player with a sense of, "This is a position that I can probably figure out over-the-board," versus, "This position is absolute madness, and so if I'm not able to just memorize what the engine says, I probably won't be able to figure it out, and I may end up just lost."
All of this is very controversial on its own, and I get no sense whatsoever that these controversies are being propped up in some way to support a position on women's chess.
Unsurprisingly, there was some controversy.
First, some organization. The primary international organization that is involved in many of the highest-level chess tournaments is FIDE. Since I know the most about US concerns and TheMotte is still somewhat heavily US, I'll also discuss the US Chess Federation, which operates, unsurprisingly, within the US. Some international events are directly organized by FIDE. Many times, FIDE will 'rate' an event run by some other organization. USCF also does their own rating system for their own events, but FIDE may rate them, too. That is, USCF hosts a variety of events, and some of them are not FIDE rated, while others are. What it means to be 'rated' is that the organization will take the results of the event and use them to update their list of ratings for players (which is meant to be a measure of how good they are). USCF may host an event that is not FIDE rated, and just your USCF rating will update. USCF may also host an event that is FIDE rated, and then both USCF and FIDE ratings change.
The perception of FIDE is that it has many institutional connections to Russia and similar countries. The Soviet Union used to be a powerhouse in international chess, and they still have a fair amount of pull in the organization. The current president of FIDE is Russian. This is not strictly determinative of what they will choose to do (for example, at least one top-level Russian player who was an outspoken supporter of the Ukraine war was banned, and other Russians have been playing "without a flag" since the war started), but FIDE does not necessarily lean in the direction of US politics. In 2023, FIDE enacted a policy on transgender players. It was controversial, and I'll just let chess.com describe the controversy. USCF, on the other hand, had enacted a much more permissive policy that simply accepted self-identification. My understanding is that if USCF runs an event that is FIDE rated, the FIDE rules control.
There is, of course, controversy, but I think there are at least a couple factors that make it less likely to come up as much. First is that the people who are most likely to be upset about it are in the US (or perhaps in other countries that have more US-levels of pro-trans, and perhaps their own national federations have taken similar stances to USCF), and there's very little point in complaining about/to USCF, since USCF has enacted their own, more permissive policy. They would have to complain about/to FIDE. And, well, everyone seems to have something to complain about with FIDE, so it's hard to have this one move very far up the list. FIDE is also viewed as being pretty hard to influence, and so especially with such significant Russia/Russia-like connections, many folks probably think that it would be basically shouting into the void. They're probably not going to change FIDE's mind. The best chance would be to prop up a competing organization, and if that's going to happen, it's probably going to be primarily because of other grievances, so a pro-trans person may just not bother emphasizing the trans thing and just latch on to other criticisms/reasons to split, but holding hopes that if such an effort is successful, maybe there's a chance that whatever replacement organization would be more likely to be more pro-trans.
The second factor is, frankly, the vibe shift, where it seems like trans stuff has just been getting less sway in general. It's not that there's no controversy, just that it doesn't seem to be getting quite as much attention.
I'm not really aware of any high-rated male players transitioning and then winning some or a bunch of highly-respected women's events.
women's chess leagues (in my observation, uncontroversial)
Oh boy, you haven't seen the controversy?
My sense of the story is basically this. The big question hanging over the game is, "Why do we not see more women at the very top tier of chess?" Interestingly, my sense has been that almost everyone involved in this conversation is actually perfectly fine in saying that physical sports are different (this does not hold for the general population, but it seems to hold within the community of folks who are involved in the conversation about top-level chess), but obviously, the question still lingers for what is mostly understood as an almost entirely cognitive game. "Sure, it makes sense that you're going to have differences in powerlifting, but what is the nature of the underlying cause of the observed difference in chess?"
Of course, folks consider the possible counter-examples, like Judit Polgar, who peaked at #8 in the world, and there is usually some debate as to whether she stands for the proposition that it is generally possible to have a higher proportion of women at the very top or whether she is just an outlier among outliers, the Bobby Fischer of women's chess, but that even the Bobby Fischer of women's chess couldn't reach the very top of the men. There is obviously no clear answer here, but it is a question that is absolutely discussed every time the controversy appears.
The immediate domain of the actual controversy is the question, "Should there be women-only events/sections?" On the one hand, if there aren't women-only events/sections, then folks worry that women will be mostly shut out of top-tier competitions (not due to active discrimination, but because their rating levels simply won't meet many of the qualification criteria for open sections of closed events).
Aside, because I just said "open sections of closed events", and that sounds weird. Oftentimes, the word "open" is used in two senses. In the first sense, it is that the tournament is generally open to all participants. This would be like "the US Open", where anyone, regardless of chess level or achievement, can just decide to sign up and play. This is contrasted to "closed" or "invitational" tournaments, which generally have a set of qualification criteria or are perhaps even more directly just a set of hand-picked competitors that are invited to play. The second sense is that many tournaments, whether "open" or "closed", will often have a women's section. However, like many other sports, they won't have a "men's" section; they'll have "women's" and "open", so that a female competitor can choose to play the open section if she'd like (and for cases with qualification criteria, if she qualifies), but is also able to choose to play the women's section.
If an event has a women's section, there's usually a side question about whether the prize fund in the women's section is as high as for the open, but that's usually secondary to the main question. Generally, one of the arguments for having a women's section is that if an event doesn't have a women's section, then with the current ratings of the populations of men/women at the highest level, most women would not likely be able to compete for a significant amount of prize money... or they'll be shut out entirely from closed/invitational tournaments because they're less likely to meet the qualification criteria. With less chance of winning substantial prize money, it would be more difficult for them to continue making a career out of chess, and the thought goes that not having women's only sections, with separate prize pools, will cause more women to leave competitive chess, which would either make the disparity worse or at least sort of lock in the disparity.
On the other hand, there's a competing theory concerning the reason why there aren't more women at the highest levels. The theory goes like this. In order to improve at chess, even all the way to the top level, you must have a lot of experience playing against top-tier competition. It is by having these experiences, seeing how they best attack your weaknesses, and learning from it, that you improve your own game. If you don't keep going at it against the best, you're less likely to figure out how you can be better. On this view, having women-only events/sections might seem like a good idea in that it makes it more likely that they'll be able to partake in a prize pool, but you run the risk of 'quarantining' the women. Perhaps someone is currently good enough to consistently win good prizes in women's sections, but if she started playing open sections, she'd likely struggle. One might argue that she should, nevertheless, play open sections in the interest of her long-term improvement, gaining the experience of playing against the absolute best players. However, her direct financial incentive is to play in the women's section, where she's more likely to make more money now. This has been argued by some number of top female players, and a few have even eschewed women's events entirely, choosing to play only opens. So far, none have made it to the top top with the men, but a few are at least putting their money where their mouth is, in a sense, and playing only open sections in accordance with this line of thinking.
So, the controversy is a bit of a trap. Both having women's sections and not having women's sections can be 'problematic', depending on how you view it. The debate is still unsettled in the community for whether there is an underlying theory that can explain the current disparity or whether that disparity can, in theory, be 'fixed'. It's not the most public controversy of controversies, but within the small community of top-level chess, it's absolutely a controversy.
In advance, you just don't. I mean, you eventually will, when the bill comes, but before that, it looks like our civilization is not advanced enough to find an answer to this question. That's one of the infuriating things in the US medical system - everything is set up to make it nearly impossible to state the cost upfront, or at least everybody involved in the system has been telling me so for years. Of course, this has a great benefit (for the providers involved) of precluding any price comparisons.
I suppose I should just say it. I know you implied it, but someone should just say it directly. This thing that everybody involved in the system has been telling us for years... that the system is not advanced enough to find an answer to this question... is a lie. The people involved have the numbers that are required. They can just give those numbers to patients. When this is pointed out, they will lie and misdirect and do everything they can to throw up fake and imagined roadblocks to this very simple reality, to the point of playing dumb/lying about whether they are even capable of identifying the names of the numbers in question. It is the great shame of the medical industry that they have harmed so many patients by their addiction to price opacity. I've pointed before at pieces like this where they talk of patients making choices to not get care because of price opacity or situations where because talking about prices is verboten, the doctor might prescribe an expensive drug that the patient won't buy, but could have prescribed a cheaper, almost as good, drug that the patient could actually afford and would buy. I don't know how one would even estimate the number of times that people simply suffer through problems rather than seek medical help because of price opacity. They feel like if they even consider seeking medical help, they will never have any further chance to consider the cost involved. The perception is that if they do it, they're basically spinning the roulette wheel and then will learn after the fact, after services have been rendered, whether they will owe $10 or $10k. It's unsurprising that many reasonable decisionmaking-under-uncertainty-and-budget-constraint algorithms just opt out of that game of roulette.
trained females vs average untrained male ... I don't know how exactly they determine what intermediate even means
This is what I tried to basically give a conjectured definition of:
I'm thinking of a plot where the x-axis is something like "Number of months of training for the male" and the y-axis is "number of months training for the female". Each data point would be a point at which they are roughly equal in performance.
You also use the phrase "decently trained women". Like, what is that? This is what I'm getting at.
I really don't know, but based off that, the strength benchmark sites and the Reddit comments in that one thread I linked, it seems like it's just that your guy is actually just very high if he can bench over his body weight.
I think you're going off extremely few data points and seriously underestimating how much a moderate amount of training and technique can accomplish.
women without exogenous testosterone
Yeah, the third caveat that I didn't mention in my comment above is chemicals. Both male/female world-record type stuff is obviously contaminated by all sorts of gear.
There are certainly a couple of big caveats in here. First is how long you've been training. The original comment said stuff like "a couple years of training". How long someone has been training will obviously have a significant effect on how far along the progression toward "elite" they are. You're grabbing stats from record-setting women. It would be much more nuanced to take, say, some sort of typical progression after 2-5 years.
The second big caveat is body weight. A quick look at April Mathis on OpenPowerlifting puts her around 250-260. I doubt ChatGPT is really considering this. I happen to be freshly training a newbie right now. He's male, about 160lbs. It's been a couple of months (I just checked my records, and it's been eight bench sessions). I haven't done any 1RM training with him, just very slowly progressing on a beginner program. He's shown absolutely no sign of plateauing; I'm certainly not pushing him to progress maximally quickly; we're just taking it slow and steady. With what he's already done, my estimate of his equivalent 1RM would be about 165lb. So, I'm pretty confident the ChatGPT estimate is quite low. I'm sure ChatGPT's estimate would be even worse if the guy had a bigger frame and body weight.
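For anyone wondering how an "equivalent 1RM" gets estimated without actually testing a max, one common rule of thumb is the Epley formula, which scales a submaximal set by the number of reps. The sketch below assumes Epley; the comment above doesn't say which formula (if any) was used, and the weights and reps here are made-up inputs, not the trainee's actual numbers.

```python
def epley_1rm(weight, reps):
    """Estimate a one-rep max from a submaximal set using the Epley formula.

    weight: load lifted, in any unit (the result is in the same unit)
    reps:   reps completed with that load
    By convention a single rep is just treated as the max itself.
    """
    if reps < 1:
        raise ValueError("reps must be at least 1")
    if reps == 1:
        return float(weight)
    return weight * (1.0 + reps / 30.0)


# Hypothetical set: 100 lb for 5 reps.
print(round(epley_1rm(100, 5), 1))
```

These formulas get less reliable as the rep count climbs, so estimates from sets of roughly five or fewer are usually taken more seriously than ones from high-rep sets.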
It would certainly be interesting to consider body-weight-equivalent trajectories for men/women. I'm 100% confident that if both were completely untrained, the male would be able to bench more than the female. I'm also 100% confident that for two elite, been training specifically for powerlifting for a decade or two, lifters, the male would be able to bench more than the female. Interesting questions would be things like, "About how many years of training does it take for the body-weight-equivalent female to surpass the completely untrained male?" Also, how do those trajectories progress? I'm thinking of a plot where the x-axis is something like "Number of months of training for the male" and the y-axis is "number of months training for the female". Each data point would be a point at which they are roughly equal in performance. My guess is that the second derivative of such a plot would be positive (that is, each additional increment of training for a male would require even more increasing increments of training for a female). And obviously, the plot would just tap out at some point, because even non-world-class male lifters will be able to surpass the female world records.
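The "positive second derivative" guess can be checked on real data with nothing fancier than second differences. A minimal sketch, assuming you somehow had matched points of (male months of training, female months of training at equal performance) with evenly spaced male months; the function name and the sample points are invented placeholders, not measurements.

```python
def second_differences(points):
    """Discrete second differences of female months against male months.

    points: list of (male_months, female_months) pairs, assumed to be
    evenly spaced in male_months. All-positive results are consistent
    with a curve that bends upward, i.e. each extra increment of male
    training needs a growing increment of female training to match.
    """
    ys = [female for _, female in sorted(points)]
    return [ys[i + 1] - 2 * ys[i] + ys[i - 1] for i in range(1, len(ys) - 1)]


# Placeholder numbers only, chosen to show the shape being conjectured.
made_up_points = [(0, 12), (1, 15), (2, 20), (3, 28)]
diffs = second_differences(made_up_points)
print(diffs, all(d > 0 for d in diffs))
```

With real data you would also want to note where the curve stops existing at all, since past some amount of male training there is no amount of female training that matches.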
I'm going to add another Hikaru video, one just released today, because he went on a mini rant about this exact topic. It starts here with another GM asking him about a possible computer line from the game he had just played (he had a smile on his face, because he knows). The mini rant ends with, "This is the problem with these games, it's that, it's like whenever you use a computer program, you really lose the human feel, because some of these moves are just not human, or they just don't feel right. They just don't feel right at all."
How to effectively use engines to improve your ability to play human over-the-board games is genuinely hard/controversial.