
Mini Martial Artists - Part 2

By BillyDataHayes Aug. 30, 2025, 1 a.m.

TL;DR – I created a structure that uses statistics from the UFC to simulate fights between my Mini Martial Artists!

Setting Up the Problem

Alright so I haven’t returned to sleeping under a bridge for the past few months – I’ve been working on something far more counter-productive! In my quest to beat the system of sports betting I decided to focus on the different bets available in UFC fights. The reason for choosing the UFC was simple – two fighters enter and one fighter leaves. I also figured it was a sport that would have noisy data (making it much more difficult to predict the outcome of a fight) and therefore it would be difficult for bookies to set accurate lines. It also had the advantage of pitting individual opponents against one another, meaning I wouldn’t have to worry about balancing a team’s stats. The only stats that should matter in a fight are those of the individual fighter.

I came across a potentially useful website that included UFC fighters and their fight history. Not only does it contain information about who they fought and when, it also has round-level data about how many times a fighter was taken down, the number of times they attempted a submission, the amount of time they spent in ground control, and even the number of times they were hit in the head! I was able to… acquire the data… and then parse and clean it to build a table of roughly 7,500 fights dating back to 1994 with UFC 2: No Way Out (a crazy time for the sport when any fighter could fight any opponent regardless of weight class). Due to the volume of data available on the site, I was able to collect ~430 different features on fighters, including things like the number of significant strikes to the head, body, and leg, the method of victory, and the total time of the fight.

While I was initially using this data to predict who would win a given fight (moneyline bets) and the total time a fight might take (totals bets), I had another idea to link this dataset to a project about some tiny fighters I worked on a while back. I got to wondering if there was some model I could make based on the available data that might make an interesting fight simulator. Then, I could assign one of these fighters the same stats as a real UFC fighter (a “Spirit Fighter”) and watch them brawl it out! If I could make an interesting enough system then it might be fun to play the role of the bookie and set lines for friends to gamble fake money on fake fights with fake mini martial artists (who fight with real heart).
I started out by drawing a sketch of the different positions an MMA (Mini Martial Arts) fighter might end up in and assigning a variable to each state transition:

It looks a little hectic now but don’t worry we’ll clean it up in just a moment. The main idea here is that there are 6 key states a fighter can end up in which include:

[distance, close, clinch, dominant_ground, unsecure_ground, vulnerable_ground]

The two fighters start in the distance position in the cage, which means close enough to strike with punches and/or kicks (variable a) but not close enough to do things like submit their opponent (variable j).

If one fighter wants to take another fighter to the ground to start grappling and/or attempting a submission then they can attempt a takedown with sequence (b -> d -> r -> j). If the attacker succeeds, they take the dominant ground position. Click the fighters below to watch a successful takedown attempt.

If the attacker fails, the fighters end in the clinch position. Click the fighters below to watch a failed takedown attempt.

Good defense! Of course this raises the question - how will we determine whether the attacker succeeds or fails in a given attempt? How many of their takedown attempts should succeed and how many should fail? This is where our stats come in handy!

Many of the fighters have stats about situations where they’ve both attempted and defended takedowns that (with the right amount of patience) can be coaxed out of our dataset of fights. In the case of takedowns, we can tell that a takedown was attempted by simply looking at features such as:

[fighter_A_takedowns_total_landed, fighter_A_takedowns_total_attempted]

These can help give us a sense of how frequently a fighter can take down their opponent (i.e., how hopeless their opponent is at defending against them). An example of a legendary takedown artist is Georges St-Pierre who (as of the time of writing) has attempted 137 total takedowns throughout his career and landed 90 of them, giving him a ~65.69% takedown success rate. Getting a little trickier, we can also extract the fighter’s takedown defense by looking at how many times all their opponents attempted to take them down and how many of those takedowns landed, based on the following stats:

[fighter_B_takedowns_total_landed, fighter_B_takedowns_total_attempted]

Kamaru Usman is a great example of a legendary takedown defender. Throughout his entire career his opponents have attempted 49 takedowns on him and only recently have 5 broken through his defense, giving him a respectable ~90% takedown defense.
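
The arithmetic behind those two percentages can be sketched in a few lines of Python. The helper name `rate` is just an illustrative choice, and the counts are the ones quoted above:

```python
def rate(successes: int, attempts: int) -> float:
    """Fraction of attempts that succeeded (0.0 when there were no attempts)."""
    return successes / attempts if attempts else 0.0

# Georges St-Pierre's offense: 90 landed out of 137 attempted.
gsp_offense = rate(90, 137)

# Kamaru Usman's defense: opponents landed 5 of 49, so he stopped 44 of 49.
usman_defense = rate(49 - 5, 49)

print(f"{gsp_offense:.2%}")    # 65.69%
print(f"{usman_defense:.2%}")  # 89.80%
```

Note that defense is computed from the attempts that did not land, which is why the opponent-side columns are needed.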

Simulating Fights

So let’s ignore any weight divisions and say that a fighter with the stats of Georges St-Pierre attempts a takedown on a fighter like Kamaru Usman. Since Usman has the higher success rate (~90% defense versus ~66% offense), should he defend the takedown every time? Additionally, if there were a brand new fighter who attempted only one takedown in their career and succeeded, should they be considered a better grappler than Georges St-Pierre?

I don’t see the fun in modeling my fights that way. Our fight mechanics should weight established fighters' records more heavily and treat less established fighters with higher uncertainty. Therefore I introduce to you… The beta distribution! (Ooooooooh).

Above is a beta distribution probability density function (PDF), which shows how plausible each possible success rate is for a given event. The x-axis represents the “true” probability of the event occurring for that fighter and the y-axis represents the probability density, i.e. how much confidence we place in each value of that probability. The “true” probability of success has to be somewhere within the interval from 0 to 1, and the higher the curve is over a region of the x-axis, the more likely the true value is to fall within that region.
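
For the curious, the curve itself is easy to compute. Here is a minimal sketch of the standard beta density using only Python's `math` module. The function name `beta_pdf` is an illustrative choice, and `alpha` and `beta` are the distribution's two shape parameters:

```python
import math

def beta_pdf(x: float, alpha: float, beta: float) -> float:
    """Density of Beta(alpha, beta) at x; zero outside the open interval (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    # Work in log space so large shape parameters do not overflow.
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    log_density = log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log1p(-x)
    return math.exp(log_density)

# Beta(1, 1) is flat: every success rate is equally plausible.
print(beta_pdf(0.3, 1, 1))

# Beta(2, 2) is a gentle hump centered on 0.5.
print(beta_pdf(0.5, 2, 2))
```

Evaluating this function over a grid of x values between 0 and 1 traces out a curve like the one described above.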

The inputs of the beta distribution are simple – alphas and betas. Alphas represent the number of successful trials and betas represent the number of failed trials. All fighters will start with the initial prior of `alpha=1`, `beta=1` (or some other value that represents the typical UFC fighter takedown success / failure rate), and their official UFC statistics will then update that prior. This way, a fighter with no previous experience will have a 50/50 success rate against a similar novice opponent. Additionally, fighters who may have landed one takedown in one of their first fights don't automatically qualify for a 100% chance to take down more experienced opponents.
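
A minimal sketch of that update, assuming the plain `alpha=1`, `beta=1` prior. The `BetaSkill` class and its method names are hypothetical, made up for illustration:

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class BetaSkill:
    alpha: float = 1.0  # prior pseudo-count of successes
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, successes: int, failures: int) -> "BetaSkill":
        """Add observed successes and failures on top of the current counts."""
        return BetaSkill(self.alpha + successes, self.beta + failures)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def std(self) -> float:
        total = self.alpha + self.beta
        return math.sqrt(self.alpha * self.beta / (total ** 2 * (total + 1)))

novice = BetaSkill()                   # no fights yet
one_for_one = novice.update(1, 0)      # landed a single takedown in one attempt
gsp = novice.update(90, 137 - 90)      # 90 landed, 47 missed

for name, skill in [("novice", novice), ("one_for_one", one_for_one), ("gsp", gsp)]:
    print(name, round(skill.mean, 3), round(skill.std, 3))
```

The one-for-one rookie and St-Pierre end up with similar averages (about 0.67 and 0.65), but the rookie's spread is several times wider, which is exactly the extra uncertainty we want for less established fighters.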

Let’s take a look at a graph that includes two competing beta distributions – one for Georges St-Pierre’s takedown offense and one for Kamaru Usman’s takedown defense:

The further to the right one fighter’s distribution is, the more of an advantage they have in the takedown attempt. In this instance, we can see that Usman has a much stronger defensive advantage against a takedown due to his significantly higher rate of success. However, the subtle overlap in these two distributions tells us that there is still a small chance of him being taken down. So how can we use these two distributions to settle whether Usman defends against a takedown or slips?

We can “roll” for each fighter by sampling one point from each of their distributions and allowing the higher value to be selected as the winner (this is a key concept I’m borrowing from A/B testing, and can also be seen in techniques like Thompson Sampling). Since Usman’s takedown defense is ~90%, we’re most likely to roll a value near 0.9, though the range extends as low as ~0.55 and as high as ~0.999. The higher the curve on the y-axis in our chart, the more likely we are to sample a roll from that region. Click the chart below to try sampling a few points from Usman's takedown defense distribution.

Since St-Pierre’s takedown offense is ~65.7%, we’re most likely to roll a value near 0.657, though the range extends as low as ~0.55 and as high as ~0.75. Click the chart below to sample a few points from St-Pierre's takedown offense distribution.

Therefore, in the rare event that St-Pierre rolls higher than Usman, the takedown will succeed! Try clicking on the chart below to attempt a few takedowns.
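
The roll described above can be sketched with Python's built-in `random.betavariate`. This is only an illustration, assuming the `alpha=1`, `beta=1` prior plus the career counts quoted earlier; the function name `takedown_succeeds` is hypothetical:

```python
import random

def takedown_succeeds(offense, defense, rng) -> bool:
    """One roll: sample both skills; the takedown lands if the attacker rolls higher."""
    attack_roll = rng.betavariate(*offense)
    defend_roll = rng.betavariate(*defense)
    return attack_roll > defend_roll

# (alpha, beta) = (1 + successes, 1 + failures)
gsp_offense = (1 + 90, 1 + 47)    # 90 of 137 takedowns landed
usman_defense = (1 + 44, 1 + 5)   # 44 of 49 takedowns stopped

rng = random.Random(42)  # fixed seed so the run is repeatable
trials = 100_000
landed = sum(takedown_succeeds(gsp_offense, usman_defense, rng) for _ in range(trials))
print(f"estimated takedown chance: {landed / trials:.4f}")
```

Repeating the roll many times like this gives a rough estimate of how often the takedown would land over a long series of attempts.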

It's not looking good for the fighter with St-Pierre's stats, but keep in mind that these two fighters are outliers (waaaaaay more experience grappling than even the average world-class UFC fighter). Below your very own fighter is waiting for you to determine their fate! You can alter your fighter's past successes (and failures) by adjusting the slider to see how it affects their odds of landing a successful takedown. Watch how much easier (or more difficult) it becomes to land a takedown as you alter your fighter's takedown skills.
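
To get a feel for what moving the slider does, here is a rough sketch that computes the takedown odds numerically instead of by repeated rolling. It is one possible approach, not necessarily what the chart does: the opponent's record, the 20 career attempts, and the function names are all made-up assumptions for illustration.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x in (0, 1)."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log1p(-x))

def takedown_chance(offense, defense, steps=2000):
    """Approximate P(offense roll > defense roll) with a midpoint Riemann sum."""
    dx = 1.0 / steps
    defense_cdf = 0.0  # probability that the defender rolls below the current x
    chance = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        defense_mass = beta_pdf(x, *defense) * dx
        defense_cdf += defense_mass / 2   # advance the CDF to the cell midpoint
        chance += beta_pdf(x, *offense) * dx * defense_cdf
        defense_cdf += defense_mass / 2   # advance the CDF to the cell's right edge
    return chance

# Hypothetical opponent who has stopped 6 of 10 takedown attempts: Beta(1 + 6, 1 + 4).
opponent_defense = (7, 5)

# "Move the slider": your fighter has 20 career attempts and lands 0, 5, ..., 20 of them.
attempts = 20
for landed in range(0, attempts + 1, 5):
    offense = (1 + landed, 1 + attempts - landed)
    print(landed, round(takedown_chance(offense, opponent_defense), 3))
```

Each extra landed takedown shifts the offense distribution to the right, so the printed odds climb as the record improves.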

We can extend this to more than just takedown attempts. This technique can be applied to their probability of landing a strike, making a significant strike, knocking down their opponent, etc. Remember when I said I would clean up the chain of events that simulates a fight? Here we can see a digital version of my sketch:

Each state has its own restrictions on the actions a fighter can take and the next state they can transition into. I also ended up removing the “close” state and decided to consider it a part of the clinch state. Each action’s probability has been tied to some combination of stats from our original reference and outlined in the following `actions` Excel sheet:

I won’t bore you with too many of the details, but after processing this fight data and running it through a sequence of rolls for different state transitions, the output is a dataset that tells us what happened to each fighter in the match. Here is a sample from a fight between Gabriel Eglacier and an Oompa Loompa:
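
Separately from that sample, the idea of chaining rolls into state transitions can be illustrated with a toy example. This is a hedged sketch, not the actual simulator: the `ACTIONS` table and the `attempt` function are hypothetical, and only the takedown row (success leads to the dominant ground position, failure leads to the clinch) comes from the description earlier in the post.

```python
import random

STATES = ("distance", "clinch", "dominant_ground", "unsecure_ground", "vulnerable_ground")

# Hypothetical action table: action -> (required state, state on success, state on failure).
# A fuller table would have one row per action in the diagram.
ACTIONS = {
    "takedown": ("distance", "dominant_ground", "clinch"),
}

def attempt(action, state, offense, defense, rng):
    """Resolve one action with a beta roll and return a record of what happened."""
    required, on_success, on_failure = ACTIONS[action]
    if state != required:
        raise ValueError(f"{action!r} is not available from state {state!r}")
    success = rng.betavariate(*offense) > rng.betavariate(*defense)
    return {
        "action": action,
        "from": state,
        "success": success,
        "to": on_success if success else on_failure,
    }

rng = random.Random(7)
event = attempt("takedown", "distance", offense=(91, 48), defense=(45, 6), rng=rng)
print(event)
```

Collecting one such record per action over the course of a fight yields a table of events for each fighter.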

Future Improvements

There are of course tons of improvements I want to make to this model, including but not limited to the following:

Weighting successes and failures by the skill of the opponent they were against
- A takedown on a fighter like Usman is much more impressive than a takedown on a less established fighter
Weighting priors
- By recency of fights
  - More recent fights should count as more important than less recent fights, both because fighters eventually get matched up with stronger opponents and because it gives a more accurate view of the current state of the fighter.
- By Strength of Opponents
  - In Covington vs Usman 2, Covington attempted 11 takedowns and didn’t land a single one. This contributes to ~20% of Usman’s takedown defense. Should these be weighted lower given that they were all made by the same opponent, or should the opponent’s lower stats still contribute to their success? Maybe this could be adjusted based on the understanding we had of the fighter pre-fight.
Actions should not be taken at random
- Currently there isn't any logic behind which action a fighter takes. Specifically, they aren't attempting to achieve any particular state and are instead following the path with the highest combined probability. It would be pretty funny to watch a UFC fight where the two fighters take their actions completely at random. However, a fighter who is a better striker is more likely to want to stay on their feet, and a fighter who is a better grappler is going to want to take the fight to the ground. Some kind of smarter decision-making framework will need to be implemented here so that the fighters can play to their strengths and strategically make their next moves instead of just taking whatever actions are available to them.
  - Maybe dynamic programming based on beta distribution differences? That could tell me the actual probability of an event occurring.
Weight classes
- Maybe weight classes could be implemented for my fighters, but I also like the idea of a Spirit Fighter of a heavyweight losing to the Spirit Fighter of a flyweight.
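
As one possible shape for the recency idea in the list above, older fights could be discounted before they are added to a fighter's alpha and beta. This is purely a sketch under assumptions: the exponential half-life scheme, the function name, and the example record are all hypothetical and not part of the current model.

```python
def recency_weighted_skill(fights, half_life=5.0, prior=(1.0, 1.0)):
    """Build (alpha, beta) from per-fight (landed, attempted) pairs, oldest first.

    A fight that is `half_life` fights old counts half as much as the newest one.
    """
    alpha, beta = prior
    newest = len(fights) - 1
    for index, (landed, attempted) in enumerate(fights):
        age = newest - index                 # 0 for the most recent fight
        weight = 0.5 ** (age / half_life)
        alpha += weight * landed
        beta += weight * (attempted - landed)
    return alpha, beta

# Made-up record: a fighter who struggled early and improved in recent fights.
record = [(0, 4), (1, 5), (3, 4), (4, 4)]
print(recency_weighted_skill(record, half_life=2.0))
```

A shorter half-life makes the model forget old fights faster, while a very long one approaches the unweighted counts.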

Conclusion

All this is to say that I finally have a method that allows my fighters to fight one another. It's still a rough sketch of a model and there are plenty of improvements I can make here. This quickly becomes abstract and complicated at the individual script level. If only there was a place that I could showcase all of this information...


Written by BillyDataHayes

I am a data scientist that likes to learn. In my spare time, I can usually be found working on projects that build on concepts that I study in math and statistics. I enjoy discovering anything and everything data related - from neural networks to the neurons of the people who implement them.
