Measuring Player Skill: Follow-Up Experiment

As a follow-up to the discussion in the thread about measuring the skill of the player, rather than RNG or wallet size, @JonahTheBard and I decided to do a little experiment. We tried to control as many variables as possible.
@IvyTheTerrible, @Math4lyfe, @Rigs, y’all had expressed interest in the results. If any of you want to see the raw data, I can send you the screenshots from each run, or the Excel spreadsheet, on Discord or Line.

Here are the parameters:

  1. Create a 3* team and keep it the same throughout the experiment. No leveling up or adding emblems or costumes.
  2. Run that 3* team through Pirates Rare tier, level 9, on autoplay for ten runs. That is the first level with 3 bosses (Peters, Vodnik, Boomer). Take a screenshot of the score after each run. Remove any bonus points for the chests, then average the ten scores to get a baseline for the AI.
  3. Run the same level manually ten times with the same team, with no item use. Screenshot those scores and average them, again removing the bonus points for chest enemies.
  4. Compare the manual score average with the AI score average with the same team.
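The steps above boil down to a small calculation. Here is a minimal sketch, assuming each run is recorded as a (final score, chest bonus) pair. The function name and all the numbers in it are made up for illustration and are not the scores from this experiment.

```python
from statistics import mean

def adjusted_average(runs):
    """Average the run scores after removing each run's chest bonus.

    `runs` is a list of (final_score, chest_bonus) pairs, one per run.
    """
    return mean(score - chest for score, chest in runs)

# Made-up numbers for illustration only; NOT the scores from the experiment.
autoplay_runs = [(31000, 1000), (28000, 0), (32500, 500)]
manual_runs = [(49000, 1000), (50500, 500), (46000, 0)]

auto_avg = adjusted_average(autoplay_runs)
manual_avg = adjusted_average(manual_runs)

print(f"autoplay baseline: {auto_avg:.1f}")
print(f"manual average:    {manual_avg:.1f}")
print(f"manual / autoplay: {manual_avg / auto_avg:.3f}")
```

With real data, each list would hold ten pairs, one per screenshot.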

I did this experiment twice, once with a mono yellow team consisting of the following heroes:

And again with a 2-1-1-1 team (2 yellow, no purple) as follows:

Results in next post

6 Likes

For the mono yellow team, autoplay had an average score of 30018.2. Of note, autoplay only completed the level 5 times out of 10. I did not continue with gems.

Subcategory scores as follows:
Average score: 30018.2
Enemies killed: 9743.7
Time bonus: 6328.3
Match bonus: 11473.4
Health bonus: 2472.8

For mono played manually, we have:
Average score: 49054.7
Enemies killed: 12292
Time bonus: 13378
Match bonus: 16365.1
Health bonus: 7019.6

This team was hindered by the lack of a healer, but it was hindered equally.

6 Likes

For the 2-1-1-1 autoplay team, which did have a healer, but still was defeated once, we have:

Average score: 39549.4
Enemies killed: 11800.3
Time bonus: 10652.9
Match bonus: 12818.6
Health bonus: 4277.6

And for the 2-1-1-1 team played manually, we have:
Average score: 51265
Enemies killed: 12292
Time bonus: 13787.7
Match bonus: 17014
Health bonus: 8171.3
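As a quick sanity check on the four sets of numbers posted above, the sketch below adds up the subcategories and compares them with each posted average. It assumes the average score is simply the sum of the four subcategories once chest bonuses are removed. That is an assumption about the scoring, not something stated in the posts.

```python
# Averages as posted in this thread:
# (total, enemies killed, time bonus, match bonus, health bonus).
posted = {
    "mono autoplay":    (30018.2, 9743.7, 6328.3, 11473.4, 2472.8),
    "mono manual":      (49054.7, 12292.0, 13378.0, 16365.1, 7019.6),
    "2-1-1-1 autoplay": (39549.4, 11800.3, 10652.9, 12818.6, 4277.6),
    "2-1-1-1 manual":   (51265.0, 12292.0, 13787.7, 17014.0, 8171.3),
}

for label, (total, *parts) in posted.items():
    gap = abs(total - sum(parts))
    # Allow a small tolerance for rounding to one decimal place.
    status = "ok" if gap < 0.05 else f"off by {gap:.1f}"
    print(f"{label:<17} total={total:>8.1f}  sum of parts={sum(parts):>8.1f}  {status}")
```

All four rows come out consistent, so the subcategory averages can be compared directly between autoplay and manual.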

5 Likes

Conclusions from my admittedly limited dataset:

  1. The AI is definitely not suited to playing mono.
  2. The highest average score came from playing manually with 2-1-1-1. However, my best individual run came while playing mono manually.
  3. The autoplay is notably slower than I am regardless of team. This difference is much more pronounced with mono.

4 Likes

Very interesting experiment design.

I’m afraid of the results if I do this. I use Autoplay on Rare regularly to grind for XP/feeders on 3. It regularly beats my max score, which typically came from just one play, and I don’t think that bodes well for my results.

I believe I would find that I’m average or worse if I did this.

3 Likes

Actually, this is what it seems like to me as well, just by looking at the numbers.

Excellent job @NPNKY… hopefully I’ll have enough data to post soon.

So, dividing your average by the AI average gives you a score of 1.296.
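For reference, here is that division worked out for both teams, using the averages posted earlier in the thread. The function name `skill_ratio` is just a label chosen for this sketch.

```python
# Average scores as posted earlier in the thread.
averages = {
    "mono":    {"manual": 49054.7, "autoplay": 30018.2},
    "2-1-1-1": {"manual": 51265.0, "autoplay": 39549.4},
}

def skill_ratio(manual, autoplay):
    """Manual average divided by the autoplay (AI) average for the same team."""
    return manual / autoplay

ratios = {team: skill_ratio(**scores) for team, scores in averages.items()}
for team, ratio in ratios.items():
    print(f"{team}: {ratio:.3f}")
```

The 1.296 figure matches the 2-1-1-1 numbers. The mono numbers work out to roughly 1.634.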

I will see if I can beat you :grin:

I’m sure there are numerous limitations to the experiment, but it seems like a fun and achievable way to measure ‘skill’ to some extent.

4 Likes

Was that for mono or 2-1-1-1?

I think this gives a ‘smoother’ result as mono can hinge on a few lucky matches

1 Like

It was a fun little experiment. I may try something similar with the next event.

Side note…Pixie is almost like a cheat code for this. If my saved coins give me a second copy next time around, I’ll definitely be leveling her again.

And the beekeeper is just plain fun. I love the little bee minions. Death by 1000 paper cuts

2 Likes

First thoughts… played 5 levels on auto.

Mono team - maxed 2 pixie, 2 melia, 1 kvasir.
Died 3 times on abysmal boards and won twice.
Kinda losing steam on the idea as it’s wasting WE on failed attempts.
I may come back to this later with a different team. Too bad no 3* blue or yellow healer.

2 Likes

The time bonus is a problem because the AI waits for each special animation to end, then proceeds to the next SS, then waits again, etc…
Maybe removing the time bonus will give another point of view…

But we know AI is not really very intelligent :sweat_smile:
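To see what removing the time bonus would do, here is a rough sketch that recomputes the manual/autoplay ratio from the averages posted earlier, with and without the time bonus. Subtracting the time bonus from the total is an assumption about how the score is built up, though the posted subcategories do add up to the totals.

```python
# Posted averages: (total score, time bonus).
posted = {
    "mono":    {"manual": (49054.7, 13378.0), "autoplay": (30018.2, 6328.3)},
    "2-1-1-1": {"manual": (51265.0, 13787.7), "autoplay": (39549.4, 10652.9)},
}

def ratio(team, drop_time_bonus=False):
    """Manual/autoplay ratio, optionally with the time bonus subtracted from both."""
    (m_total, m_time), (a_total, a_time) = team["manual"], team["autoplay"]
    if drop_time_bonus:
        m_total, a_total = m_total - m_time, a_total - a_time
    return m_total / a_total

for name, team in posted.items():
    print(f"{name}: with time {ratio(team):.3f}, without time {ratio(team, True):.3f}")
```

On these numbers the mono ratio drops from about 1.63 to about 1.51 once time is excluded, while the 2-1-1-1 ratio stays near 1.30.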

I’m not sure because it affects us all the same…so my score divided by the AI score is affected in roughly the same way as your score divided by your ai score.

It’s designed to give us a way of comparing each other’s ability, with the AI score being used to balance out roster differences and the multiple runs balancing out board RNG.

3 Likes

But with any board on level 9, I would never lose. Losing via AI idiocy drags down the AI score, and that feels like it would falsely inflate the skill rating. Not trying to be difficult; it’s always fun to see how someone measures up. Maybe the second part of the test would be good, but comparing to the AI, I’m not so sure.

2 Likes

I don’t think mono lends itself well to the test. I’m running Belith and C. Hawkmoon in a 2-1-1-1.

1 Like

I agree. The AI is terrible at mono. It requires a different way of approaching the board, where cascades are not your friend. It might be worth looking at other stacks too, and see what the variance is. Maybe next event I’ll try it with a 3-2 and a 3-1-1

1 Like

Probably accurate. But if it were a legit test of my gameplay, I would play for speed (mono). If I play to accommodate the AI, I would likely do something similar and add in a couple healers.

Either way, the model would need to be adjusted somehow because (as mentioned) the AI waits to cast after every animation, is generally slower on tile movement and has this infuriating way of targeting the worst possible heroes.

So to tweak this, we would need to set some guidelines.

1- We would all have to use the exact same heroes - could be unfair if, say, people don’t have Pixie and Kvasir. (Kvasir slows it down, but is still so much fun to use).
2- We’d have to figure out the time calculation issue with AI
3- What about the targeting issue? That’s a vital part of the game that can affect the scores on AI greatly.

just typing some thoughts.

1 Like

That’s why we’re using the AI as a baseline. It is equally slow and stupid for everyone. But if one person gets 25% higher average scores than the AI using the same team, and another person gets 10% lower average scores than AI using the same team, that would provide us with a meaningful yardstick to measure one player against another (on as close to a level playing field as we can achieve within the game).

1 Like

But wouldn’t it be better if we used each other’s average as the baseline? Each of us could just run it 10 times, post results, average them together and score it like that.
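A minimal sketch of what that peer baseline could look like, assuming each player posts one ten-run manual average. The player names and scores below are placeholders, not real results.

```python
from statistics import mean

# Hypothetical per-player averages over ten manual runs; placeholders, not real results.
player_averages = {
    "player_a": 51000.0,
    "player_b": 47000.0,
    "player_c": 49000.0,
}

# The baseline is the mean of everyone's average rather than the AI score.
group_average = mean(player_averages.values())
relative = {name: avg / group_average for name, avg in player_averages.items()}

for name, score in sorted(relative.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f} of the group average")
```

Unlike the AI baseline, this comparison does not adjust for roster differences, so it would only be fair if everyone ran the same team.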

1 Like