Tuesday, April 14, 2020

Analysis - Kokushi Success Rates

In this post, we'll look at how Houou players deal with different kokushi starts. Do they go for kokushi, or abort? Do they try and then give up? If they go for it, how often do they get it? If they don't go for it, how often do they win? We'll look at these for the three main game modes: 4P Hanchan, 3P Hanchan, and 4P Tonpu.

There's a post on osamuko titled, "I told you not to go for Kokushi." (Archive Link) This post has been referenced a lot, but the data is a bit shaky. The dataset only includes "just over 9000 hands, each of which started with 9 or more distinct yaochuu tiles". That's a pretty small sample size. My dataset has 172,443 hands in 4P Hanchan with 9 or more distinct yaochuu tiles, so let's try to get some more refined data.

We'll start with the hanchan data. First, let's look at the success rate when players go for kokushi, to help contextualize the risk/reward. I defined abandoning kokushi as decreasing the number of unique terminals/honours in your hand within the first six discards (or before someone calls riichi or the hand ends, whichever comes first). Any hand where that didn't happen counts as a kokushi attempt.
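The abandonment rule can be sketched roughly as follows. This is only an illustrative reading of the definition, not the code from the GitHub repository: the tile notation, the function names `unique_yaochuu` and `abandoned_kokushi`, and the (draw, discard) input format are all assumptions made for the example.

```python
# Hypothetical tile notation: suited terminals as "1m", "9p", etc.,
# winds as E/S/W/N and dragons as P/F/C.
YAOCHUU = frozenset(
    ["1m", "9m", "1p", "9p", "1s", "9s", "E", "S", "W", "N", "P", "F", "C"]
)

def unique_yaochuu(hand):
    """Number of distinct terminal/honour types present in the hand."""
    return len(YAOCHUU.intersection(hand))

def abandoned_kokushi(start_hand, turns, window=6):
    """Return True if a discard within the first `window` turns lowers
    the number of distinct terminal/honour types.

    start_hand: the 13 tiles held before the first draw.
    turns: this player's (draw, discard) pairs in order. The caller is
           assumed to have cut the list at the first riichi call or at
           the end of the hand.
    """
    hand = list(start_hand)
    for draw, discard in turns[:window]:
        hand.append(draw)
        before = unique_yaochuu(hand)
        hand.remove(discard)
        if unique_yaochuu(hand) < before:
            return True
    return False
```

Under this reading, discarding one tile of a terminal/honour pair leaves the type count unchanged, so it does not count as abandoning.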
So, with 9 types, there's about a 3% chance. With 10 types, there's a 9% chance, nearly triple. And with 11 types, it goes up to around 25%, again nearly tripling. Having a pair doesn't increase your winrate that much, which is why I didn't count cutting from the pair as abandoning kokushi. Maybe cutting from the pair helps hide that you're going for it and that balances out? I don't know.

The Normal % column is the percent of non-kokushi attempts that won. It seems having a pair really helps in this regard.

Anyway, with that in mind, let's look at the players' decisions on whether to abort the hand or not.
Houou players will almost always abort a 9 tiles 9 types hand. That seems sensible. Even though kokushi is worth 32000, a 3% chance means the EV from kokushi itself is only 960. The 10 types rates are kind of interesting. Having a pair raises the winrate by less than half a percent, and yet the 10 tiles 10 types version is aborted 50% more often than the 11 tiles 10 types version. Going for it from ten uniques seems pretty justifiable.
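As a rough illustration of that EV arithmetic, here is the same calculation for the approximate success rates quoted above (3%, 9%, 25%). It is a simplification: it only counts the points from kokushi itself and ignores what happens when the attempt fails, and the names `SUCCESS_PCT` and `kokushi_ev` are made up for this sketch.

```python
KOKUSHI_VALUE = 32000  # the yakuman value used in the post

# Approximate kokushi success rates quoted in the post, in percent,
# keyed by the number of distinct terminal/honour types at the start.
SUCCESS_PCT = {9: 3, 10: 9, 11: 25}

def kokushi_ev(types):
    """Expected points from kokushi alone, ignoring all other outcomes."""
    return KOKUSHI_VALUE * SUCCESS_PCT[types] // 100

for types in sorted(SUCCESS_PCT):
    print(types, "types:", kokushi_ev(types))
# 9 types: 960
# 10 types: 2880
# 11 types: 8000
```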

Now, the 11 type aborts. With a 25% chance to get kokushi from an 11 tile start, you'd think it'd be really rare that you'd abort it. I was interested, so I gathered all the replays where an 11 unique abort happened. You can find the list here. In tight endgame positions I can understand, and the 16th where it broke a double riichi also makes sense, but I feel like you should certainly go for it with even scores... If someone wants to go through those replays and give reasoning for each or whether you think it's a mistake or not, that'd be cool to see!

Let's move on to the sanma numbers. The "I told you not to go for kokushi" article has the line, "An advanced player might say that you can do it when you start with 11 different yaochuu tiles, but if you’re playing a 3-player game on Tenhou, you can do it with 8." I wonder what the winrate is when going for kokushi from 8?
Yes, there was a 14 tiles start. You can find the replay here.

Going for kokushi with 8 types has a less than 2% winrate, so I'd recommend not going for it. Overall, the winrate for everything is higher. So, you'd think that'd mean they abort the hands less, right?
Nope, they abort the hands even more often than in 4P Hanchan. Why might this be? Well, in sanma, the hands are all higher valued, so the EV comparisons are stricter. The average non-dealer hand in 4P Hanchan is 5000, while in Sanma it's 7000. This means the hand you get when you abort will be better on average, and also, the hand your opponent wins if you fail will be worth more.

Additionally, the base winrate will be higher, since there are only three players. In four-player, you'd expect your average winrate to be around 22%, while in sanma it'd be closer to 30%. So, even though the kokushi winrates look better, they're not that different relative to the baseline.

While we're here, let's look at the Tonpu numbers.
For the most part, the abort rates match up with Sanma, though Tonpu players are even more trigger-happy with 11 types starts. Regardless of game mode, nobody has aborted a 12 tile start. For some reason, the winrate is higher when there isn't a pair, which is the opposite of the other game modes. I have no idea why this is, so I'll just say it's a data oddity.

For now, this is all I have, but I want to make a follow-up post later, looking at the rates based on seat or score position, as well as whether the number of honours influences the decision to abort. Is a 6 honour 4 terminal hand less likely to be aborted than a 6 terminal 4 honour hand? Well, maybe we'll see!

You can find the data from this post in this spreadsheet. The success tables go all the way, so it's interesting to see the win rate rise as the terminals/honours decrease. You can see the code used to gather this data on my GitHub.

2 comments:

  1. Hey, nice post! Really appreciate all the hard work you do to surface these kinds of stats for the mahjong community.

    I did some probabilistic analysis on the likelihood of certain hand distributions for 9s9h and was getting some inconsistencies with this dataset. I think this inconsistency comes from the method that you grouped up your hand structures.

    Specifically in the hanchan data sheet, I was wondering: why is something like 11 tiles 9 types missing from the spreadsheet? This kind of hand structure is technically possible (7 single tiles + 2 pairs = 9 distinct tiles), but is noticeably missing. Furthermore, this type of hand structure should be much more likely than 12 tiles 12 distinct, so it's very unlikely that there simply weren't any samples with this particular hand shape.

    I get ~1/310 chance for a 9s9h, whereas your tenhou data shows ~1/308, so the totals are pretty similar. Maybe I'm just misinterpreting the data? If so, what hand structure would the 7 single + 2 pair category fall under?

    Replies
    1. The code checked the number of unique tiles, then checked whether it had a pair. If it had a pair, it was logged as (X+1) Tiles (X) Types, and if it didn't, it would be logged as (X) Tiles (X) Types. So, an 11 tiles 9 types hand would show up as a 10 tiles 9 types hand.

      Separating them would be better for seeing the normal hand rate, but since this was focused on Kokushi success, I preferred combining the data to avoid thinness.
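A minimal sketch of the grouping rule described in this reply, for illustration only. It is not the actual code from the GitHub repository: the tile notation and the `bucket` function name are assumptions.

```python
# Hypothetical tile notation: suited terminals as "1m", "9p", etc.,
# winds as E/S/W/N and dragons as P/F/C.
YAOCHUU = frozenset(
    ["1m", "9m", "1p", "9p", "1s", "9s", "E", "S", "W", "N", "P", "F", "C"]
)

def bucket(hand):
    """Label a starting hand as '(X+1) Tiles (X) Types' if it holds any
    duplicated terminal/honour, otherwise '(X) Tiles (X) Types'."""
    yaochuu = [tile for tile in hand if tile in YAOCHUU]
    types = len(set(yaochuu))
    has_pair = len(yaochuu) > types
    tiles = types + 1 if has_pair else types
    return f"{tiles} Tiles {types} Types"

# 7 single terminals/honours + 2 pairs (11 tiles, 9 types) + 3 other tiles
hand = ["1m", "9m", "1p", "9p", "1s", "9s", "E",
        "S", "S", "W", "W", "4m", "5p", "6s"]
print(bucket(hand))  # 10 Tiles 9 Types
```

Any number of extra duplicates collapses into the single "has a pair" label, which is why the two-pair hand lands in the 10 tiles 9 types group.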
