Color Stacking Fairness Project

The recording adds about 30 seconds a raid, so it’s not that bad. Honestly, I burn through all my energy so quickly that stretching out my 6 raid flags by 3 minutes doesn’t seem so bad.

Your 3x blue 3s should be about a 1 in 2 million event. So yeah, that’s definitely one for the rarities trophy wall!

3 Likes

Just wanted to follow up with you on this hypothesis. I’m about 150 raids (about 5300 tiles) in now on my counts for enemies who color stack on defense. The results suggest that it’s pretty unlikely that there’s any bias to protect the enemy’s stacked color(s) either.

95% confidence bounds
Upper 21.67%
Mid 20.59%
Lower 19.51%

So I’m actually seeing about 7.2 tiles of the color that is strong against the enemy’s stacked color, on average. I don’t believe that it’s representative of a genuine bias in favor of hurting the enemy, though. Probably just a bit of statistical noise.

At this point, I’ve stopped rerolling with intent to find color stacking enemies, but I do still record data when they pop up.

8 Likes

Thank you for taking that extra step. I appreciate it :slight_smile:

Feel like this little guy can altogether be put to sleep now lol

4 Likes

Wow. You have so much patience. Well done.

2 Likes

Thanks! It adds about 30 seconds a raid, so it’s not too bad. I’m just about to post my 500 raid update. Still looking perfectly fair/unbiased :slight_smile:

5 Likes

Up to 17,500 tiles seen. That’s 500 raids.

Updated probability of seeing a tile of the stacked color: 20.02%

95% confidence interval: 19.4% to 20.6%

Conclusion: looking perfectly fair
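For anyone who wants to check the interval, here is a minimal sketch. It assumes the bounds come from the Agresti-Coull method named later in the thread (this post does not say which method was used) and takes the 3503 stacked-color tiles out of 17,500 from the totals row of the table below.

```python
import math

def agresti_coull(successes, trials, z=1.96):
    """Return (low, mid, high) of the Agresti-Coull interval for a proportion."""
    n_adj = trials + z * z                      # adjusted sample size
    p_adj = (successes + z * z / 2) / n_adj     # adjusted center of the interval
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj, p_adj + half_width

# 3503 stacked-color tiles in 17500 tiles (500 raids x 35 tiles per board)
low, mid, high = agresti_coull(3503, 17500)
print(f"{low:.1%} {mid:.2%} {high:.1%}")  # prints: 19.4% 20.02% 20.6%
```

The fair value of 20% sits comfortably inside the interval, which is what "looking perfectly fair" amounts to.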

Data for raids #301-500 below

Team Red Green Blue Yellow Purple Tile Count Tiles 1st color
3 r 2 g 8 9 3 6 9 35 8
3 r 2 g 6 5 10 6 8 35 6
3 r 2 g 5 6 8 7 9 35 5
3 r 2 b 8 7 11 4 5 35 8
3 r 2 g 7 6 5 9 8 35 7
3 b 2 r 10 4 6 9 6 35 6
3 b 2 g 6 10 10 6 3 35 10
3 y 2 b 9 7 6 5 8 35 5
3 y 2 b 6 11 1 4 13 35 4
3 y 2 b 3 14 7 5 6 35 5
3 y 2 b 4 7 9 8 7 35 8
3 y 2 b 8 5 11 4 7 35 4
3 b 2 y 6 6 9 5 9 35 9
3 y 2 r 8 7 5 8 7 35 8
3 b 2 y 8 10 6 7 4 35 6
3 r 2 b 5 8 10 7 5 35 5
3 r 2 b 8 10 9 4 4 35 8
3 y 2 b 8 8 8 7 4 35 7
3 y 2 r 6 6 9 6 8 35 6
3 b 2 p 9 8 5 8 5 35 5
3 b 2 p 7 7 7 9 5 35 7
3 b 2 p 8 3 6 9 9 35 6
3 y 2 b 9 11 2 8 5 35 8
3 p 2 b 4 7 8 10 6 35 6
3 y 2 b 10 7 8 3 7 35 3
3 y 2 b 9 6 7 6 7 35 6
3 y 2 r 8 3 9 11 4 35 11
3 b 2 r 7 3 9 6 10 35 9
3 y 2 b 8 4 2 10 11 35 10
3 b 2 r 7 4 9 5 10 35 9
3 b 2 r 9 6 3 8 9 35 3
3 b 2 r 5 8 6 9 7 35 6
3 g 2 b 3 8 7 7 10 35 8
3 r 2 b 7 5 8 6 9 35 7
3 r 2 p 9 6 7 7 6 35 9
3 g 2 r 6 4 8 6 11 35 4
3 b 2 g 7 9 7 7 5 35 7
3 b 2 p 10 7 9 5 4 35 9
3 r 2 y 8 9 4 7 7 35 8
3 r 2 y 12 5 5 5 8 35 12
3 b 2 p 8 6 5 8 8 35 5
3 b 2 r 4 2 7 12 10 35 7
3 p 2 b 7 7 11 6 4 35 4
3 p 2 b 3 8 7 8 9 35 9
3 g 2 b 11 5 8 8 3 35 5
3 b 2 r 8 7 10 5 5 35 10
3 b 2 g 3 12 5 6 9 35 5
3 r 2 y 6 9 8 7 5 35 6
3 r 2 y 6 8 9 8 4 35 6
3 r 2 y 8 10 5 9 3 35 8
3 b 2 y 12 5 11 5 2 35 11
3 y 2 b 7 12 3 6 7 35 3
3 p 2 b 10 9 5 5 6 35 6
3 y 2 r 10 11 5 6 3 35 6
3 y 2 r 4 10 6 7 8 35 7
3 y 2 r 7 6 5 10 7 35 10
3 g 2 r 8 7 11 6 3 35 7
3 g 2 r 5 7 6 9 8 35 7
3 b 2 y 5 4 10 8 8 35 10
3 b 2 r 8 10 4 5 8 35 4
3 r 2 g 3 9 8 8 7 35 3
3 b 2 r 7 3 6 9 10 35 6
3 r 2 y 8 7 4 6 10 35 8
3 g 2 r 8 8 9 7 3 35 8
3 g 2 r 9 6 6 8 6 35 9
3 b 2 r 9 9 5 6 6 35 5
3 g 2 b 4 8 7 9 7 35 8
3 g 2 b 6 7 6 9 7 35 7
3 b 2 r 8 7 8 6 6 35 8
3 g 2 r 6 10 7 5 7 35 10
3 r 2 g 9 10 5 7 4 35 10
3 b 2 r 7 9 6 8 5 35 6
3 b 2 r 3 10 5 11 6 35 5
3 b 2 r 10 5 7 7 6 35 7
3 b 2 r 10 7 7 6 5 35 7
3 r 2 g 11 7 3 7 7 35 11
3 b 2 p 5 7 10 4 9 35 10
3 b 2 p 4 8 4 11 8 35 4
3 b 2 p 7 11 7 7 3 35 7
3 r 2 y 8 8 8 6 5 35 8
3 g 2 b 3 9 8 7 8 35 9
3 b 2 r 8 7 6 8 6 35 6
3 b 2 r 9 6 2 6 12 35 2
3 b 2 r 5 8 7 9 6 35 7
3 p 2 r 9 7 10 6 3 35 3
3 b 2 y 8 5 7 6 9 35 7
3 r 2 p 9 5 8 5 8 35 9
3 r 2 b 5 10 9 4 7 35 5
3 b 2 p 6 7 10 7 5 35 10
3 r 2 g 9 10 4 4 8 35 9
3 r 2 g 3 9 10 8 5 35 3
3 r 2 g 8 6 5 10 6 35 8
3 g 2 b 4 6 8 6 11 35 6
3 g 2 b 10 4 7 8 6 35 4
3 b 2 r 6 6 8 8 7 35 8
3 b 2 r 8 8 9 3 7 35 9
3 b 2 r 8 6 7 4 10 35 7
3 y 2 r 8 9 3 8 7 35 8
3 b 2 r 7 10 7 7 4 35 7
3 r 2 b 11 8 4 4 8 35 11
3 r 2 b 6 8 4 8 9 35 6
3 b 2 r 8 7 8 6 6 35 8
3 y 2 r 7 10 5 6 7 35 6
3 y 2 r 6 9 6 6 8 35 6
3 y 2 r 7 6 12 7 3 35 7
3 r 2 g 10 4 4 10 7 35 10
3 p 2 r 3 5 10 9 8 35 9
3 y 2 b 6 10 6 4 9 35 4
3 b 2 r 9 5 9 8 4 35 9
3 r 2 g 8 5 4 7 11 35 8
3 r 2 g 11 7 3 7 7 35 11
3 r 2 b 9 9 6 7 4 35 9
3 g 2 p 6 7 7 7 8 35 7
3 b 2 r 7 6 7 5 10 35 7
3 r 2 g 10 9 4 8 4 35 10
3 b 2 g 6 11 2 8 8 35 2
3 y 2 g 6 7 7 7 8 35 7
3 y 2 b 8 5 7 11 4 35 11
3 g 2 p 8 6 8 6 7 35 6
3 b 2 y 11 6 6 6 6 35 6
3 b 2 y 4 8 10 6 7 35 10
3 g 2 b 7 6 8 8 6 35 6
3 g 2 b 5 8 8 6 8 35 8
3 y 2 r 7 8 6 10 4 35 10
3 r 2 g 2 6 8 9 10 35 2
3 g 2 r 7 9 6 8 5 35 9
3 r 2 y 7 4 6 11 7 35 7
3 r 2 y 6 7 8 6 8 35 6
3 p 2 r 6 8 11 6 4 35 4
3 b 2 r 7 4 12 6 6 35 12
3 r 2 g 7 4 11 8 5 35 7
3 r 2 b 9 8 10 2 6 35 9
3 b 2 g 8 7 8 5 7 35 8
3 g 2 b 5 7 8 9 6 35 7
3 b 2 y 8 6 7 5 9 35 7
3 r 2 g 4 10 9 6 6 35 4
3 r 2 g 5 7 7 9 7 35 5
3 r 2 g 8 7 10 4 6 35 8
3 b 2 r 6 8 4 8 9 35 4
3 g 2 b 6 6 9 5 9 35 6
3 g 2 b 10 4 8 4 9 35 4
3 b 2 y 8 9 7 8 3 35 7
3 y 2 g 8 7 3 10 7 35 10
3 b 2 p 5 5 6 8 11 35 6
3 r 2 y 5 7 10 7 6 35 5
3 r 2 y 8 6 8 5 8 35 8
3 r 2 y 6 13 9 1 6 35 6
3 g 2 y 6 8 7 9 5 35 8
3 r 2 p 8 7 6 7 7 35 8
3 b 2 y 7 6 6 4 12 35 6
3 b 2 y 8 6 8 7 6 35 8
3 r 2 g 11 3 7 9 5 35 11
3 r 2 g 9 8 5 9 4 35 9
3 b 2 r 4 7 9 10 5 35 9
3 b 2 y 3 12 4 6 10 35 4
3 b 2 y 10 6 7 5 7 35 7
3 g 2 y 5 6 10 7 7 35 6
3 g 2 b 4 6 10 9 6 35 6
3 y 2 b 11 8 4 6 6 35 6
3 y 2 b 3 5 13 8 6 35 8
3 b 2 g 11 7 5 5 7 35 5
3 b 2 g 16 5 4 6 4 35 4
3 b 2 g 4 9 8 5 9 35 8
3 b 2 r 10 7 6 7 5 35 6
3 r 2 g 2 6 8 11 8 35 2
3 r 2 g 3 11 5 10 6 35 3
3 r 2 g 7 10 4 5 9 35 7
3 b 2 p 3 7 11 5 9 35 11
3 b 2 r 6 5 8 4 12 35 8
3 b 2 g 6 10 5 11 3 35 5
3 b 2 g 7 7 11 4 6 35 11
3 b 2 p 6 8 9 7 5 35 9
3 b 2 p 4 8 8 9 6 35 8
3 b 2 r 9 4 4 9 9 35 4
3 y 2 r 4 10 8 4 9 35 4
3 y 2 r 6 7 10 9 3 35 9
3 b 2 g 8 4 4 10 9 35 4
3 b 2 g 7 6 7 9 6 35 7
3 y 2 r 9 8 7 7 4 35 9
3 p 2 b 7 7 5 8 8 35 8
3 p 2 b 5 10 4 6 10 35 10
3 r 2 y 4 6 7 11 7 35 11
3 y 2 r 7 8 6 6 8 35 6
3 y 2 r 9 7 3 7 9 35 7
3 y 2 r 7 5 5 8 10 35 8
3 r 2 b 8 6 6 11 4 35 8
3 r 2 b 6 12 6 5 6 35 6
3 r 2 b 7 6 10 4 8 35 7
3 r 2 y 6 7 7 9 6 35 6
3 p 2 g 11 7 6 6 5 35 5
3 b 2 r 7 7 6 7 8 35 6
3 b 2 r 6 3 10 7 9 35 10
3 b 2 r 7 6 6 7 9 35 6
3 b 2 p 9 8 4 7 7 35 4
3 b 2 r 7 4 9 7 8 35 9
3 r 2 p 4 11 9 9 2 35 4
3 r 2 p 8 5 6 7 9 35 8
3 r 2 p 2 13 8 7 5 35 2
3 b 2 r 7 6 7 7 8 35 7
3 y 2 b 6 3 12 7 7 35 7
Total (all 500 raids) 3521 3458 3550 3529 3442 17500 3503
Est % 20.12% 19.76% 20.29% 20.17% 19.67%   20.02%
Est Tiles (per 35-tile board) 7.04 6.92 7.10 7.06 6.88   7.01
10 Likes
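The per-color totals in that table can also be checked for a literal color preference. The sketch below is an illustration only, not something the author reports running: it recomputes the "Est %" and "Est Tiles" rows and then applies a chi-square goodness-of-fit test against an even 20% split. It treats every tile as an independent draw, which is an approximation, since boards with initial matches are thrown out.

```python
# Totals row from the table above (all 500 raids, 35 tiles per board).
totals = {"Red": 3521, "Green": 3458, "Blue": 3550, "Yellow": 3529, "Purple": 3442}
TILES_PER_BOARD = 35

n_tiles = sum(totals.values())
for color, count in totals.items():
    share = count / n_tiles
    print(f"{color:<7} {share:.2%}  {share * TILES_PER_BOARD:.2f} tiles/board")

# Chi-square goodness of fit against an equal split across the five colors.
expected = n_tiles / len(totals)
chi2 = sum((count - expected) ** 2 / expected for count in totals.values())

# 5% critical value of the chi-square distribution with 4 degrees of freedom.
CRITICAL_5PCT_DF4 = 9.488
print(f"chi2 = {chi2:.2f}, consistent with an even split: {chi2 < CRITICAL_5PCT_DF4}")
```

The statistic comes out far below the critical value, so these totals give no reason to reject an even color split.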

Hey @Garanwyn

First of all thanks for putting in the effort and the hours for doing the analysis. As an IT professional I love looking at numbers and stats.

One quick question… would it not be better to use one stack colour, say 3 dark 2 yellow, do a lot of raids with it, and see how well the tiles generate? Right now we are analyzing stacking, but the stacked colour varies. If the tile generation algorithm is truly random, then you could just be getting lucky with the boards you are given.

I am not a statistician by any means, but I would love to hear your thoughts on this.

1 Like

You’re welcome! I’m glad you find it interesting.

With respect to your question about color stacking:

The hypothesis we’re trying to test here is whether the tile engine has a bias against color stacking, not whether it produces equal numbers of tiles of each color (I’ve never heard any theories that there might be a literal color preference in the tile engine, and I certainly don’t see one in my data).

To test this hypothesis, we can think of every tile draw on a board as falling into one of two categories:

  1. Tiles of the strong stacked color (whatever it might be)
  2. Tiles of a non-stacked color

Then each board gives us a binomially-distributed sample.

Unless you’re hypothesizing some sort of roving color bias or roving color-stacking bias, there’s no change in our chance of getting lucky across the samples by changing colors versus by sticking with a single color.
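The binomial model described above can be sketched as follows. It assumes a fair engine gives each of the 35 starting tiles an independent 20% chance of being the strong stacked color, whichever color that is. This is only an approximation of the real game, since boards with initial matches are rejected.

```python
from math import comb, sqrt

TILES_PER_BOARD = 35   # tiles on the starting board
P_FAIR = 0.20          # fair chance that a tile is the strong stacked color

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

pmf = [binom_pmf(k, TILES_PER_BOARD, P_FAIR) for k in range(TILES_PER_BOARD + 1)]
mean = sum(k * pk for k, pk in enumerate(pmf))
std_dev = sqrt(TILES_PER_BOARD * P_FAIR * (1 - P_FAIR))

print(f"expected strong tiles per board: {mean:.2f} (std dev {std_dev:.2f})")
for k in (2, 7, 12):
    print(f"P(exactly {k} strong tiles) = {pmf[k]:.3f}")
```

Because the model depends only on the 20% chance and not on which color is stacked, boards from different stacks can be pooled into one sample.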

If you’re interested in the per-color estimates, here they are. The confidence interval bounds are pretty loose on some of them, though, since I don’t stack them often.

| Color | Games | Total Strong Tiles | 95% Low | Mid | 95% High |
|---|---|---|---|---|---|
| Red | 172 | 1224 | 19.33% | 20.35% | 21.37% |
| Blue | 221 | 1549 | 19.15% | 20.04% | 20.93% |
| Green | 36 | 239 | 16.90% | 19.06% | 21.23% |
| Yellow | 88 | 608 | 18.37% | 19.78% | 21.18% |
| Purple | 37 | 248 | 17.10% | 19.24% | 21.39% |

3 Likes

Thanks man. I kinda reached the same conclusion after thinking about it last night. You’re a legend. Keep up the good work and thanks for an awesome explanation.

3 Likes

In the theme of peer review:

Have you or others looked at the distribution of color across the board? That is, color stacking may produce the same quantity of a given color, but with those tiles spread evenly across the grid, reducing possible early matches.

Have you or others looked at the tiles generated beyond the initial board? No game is won on the opening board. Does color stacking affect the tiles that spawn after the initial ones?

You should also run a rainbow control to confirm that you are receiving the expected 20% / color generation.

Edit: Very interesting and well executed. This is definitely a benefit to the community and something that I have yelled at my tablet numerous times for. I may owe it an apology.

2 Likes

Regional color density is an interesting question. The fact that SG throws out boards with initial 3-matches is going to alter (increase) the expected mixing somewhat. I haven’t looked at or tracked the available initial color matches on any given board, though.

The challenge is that I don’t have a good model for what the distribution of color densities across trials ought to look like, so I wouldn’t know what the data was really telling me even if I had it. I’m very open to suggestions on figuring this out.

On tiles generated beyond the initial board: not yet. This is an open question. We’re trying to get someone to pick this project up, but so far no takers.

I can infer that the later tiles aren’t too bad based on my win rate being about 70%, even though I average about 350 TP lower than my opponents. But that’s a long, long way from a rigorous analysis :slight_smile:

I’m a bit less concerned about a rainbow control, given how good the distribution between colors is even when stacking. My current heroes also aren’t really compatible with going rainbow and staying in Platinum, so this would be painful for me to do.

DoctorStrange is a dedicated rainbow player, though, as is Brobb. I believe General_Confusion and KingArchur also raid with rainbow teams. We might be able to talk one of them into screenshotting boards and uploading them. I’m not really sure the game is worth the candle, though.

Thanks! Yeah, I have the data, and I still yell at my phone. Gotta love when you go up against someone 700TP higher and get 2 tiles of your strong color :laughing:

4 Likes

Much respect for the data. I will also start taking screenshots, attacking only with mono black and mono blue. It will take some time to gather statistics, but I will post them.
One problem I see is how to collect the data after the initial board… it’s a lot of work, so I’m too lazy for that.

What I really want to test is King on titans. When I see how often it’s just miss, miss, miss, I wonder whether it’s really only a 30% reduction in accuracy… By the way, in raids or cws I think King is quite fair.

1 Like

So here is some info from 100 raids. Only mono-color teams were used. Only enemies with a weak center color were attacked. Only the starting board was evaluated.
Number of stones (tiles) of the strong color - how many boards had that count.
Zero - 0
One - 0
Two - 1
Three - 6
Four - 10
Five - 10
Six - 19
Seven - 26
Eight - 8
Nine - 8
Ten - 6
Eleven - 3
Twelve - 1
Thirteen - 2

:wink:

2 Likes

The prediction would be for 43 worse-than-average starting boards, and you saw 46. So you had a little bad luck in there.
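The 43-board figure is consistent with a simple binomial calculation. The post does not say how it was computed, so this is a minimal sketch under two assumptions: each of the 35 starting tiles is an independent 20% draw, and "worse than average" means six or fewer strong tiles (the mean being 7).

```python
from math import comb

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent draws."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

BOARDS = 100
p_below_average = binom_cdf(6, 35, 0.20)
expected_boards = BOARDS * p_below_average

# Reported tallies for 2, 3, 4, 5 and 6 strong tiles in the 100-raid sample.
observed_boards = 1 + 6 + 10 + 10 + 19

print(f"expected: {expected_boards:.1f} boards, observed: {observed_boards}")
```

The observed tallies for two through six tiles sum to 46, matching the figure quoted above.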

You also ended up with some pretty nice positive luck. Two 13-strong-tile boards in 100 samples is quite high. I’ve only seen 2 in my 600 boards recorded :slightly_smiling_face:

The stats on your data overall are:

Probability of strong tile: 19.6%

Agresti-Coull 95% confidence bound:

Low: 18.3%
Mid: 19.6%
High: 20.9%

So those numbers are very much in family with what I’ve collected, and consistent with a random draw process. Thank you for doing a collection!

6 Likes

Thanks,
a sample of 100 is still too small for five colors and so many possible outcomes, so I will continue to gather data. At the moment I don’t like the difference between the 5-6 counts and the 8-9 counts. The couple of quite lucky 13s smooth out the percentage. But I will continue to take screenshots and count. :wink:

3 Likes

The distribution is always a bit asymmetric because it caps out at zero at the bottom. I’m not worried about the differences between 5-6 and 8-9 just yet, but it is something to keep an eye on.

I look forward to seeing more data!

2 Likes

From what I have noticed in tile play, with a rainbow team there is a 20% chance of each color for every tile not yet on the board, which gives you an “even-ish” start for each color. Now, for every color you choose NOT to have in your lineup, it drops the one color you don’t choose by 4% and puts THAT 4% into the OTHER colors.

So a 2-1-1-1 team with 2 red and 1 each of green, yellow, and blue will have a 16% chance of red and a 21% chance of each other color. 2 red, 2 blue, 1 yellow would be 12% red, 12% blue, 20% yellow, 28% green, 28% purple. 4 red, 1 blue is 8% red, 20% blue, and 24% each for green, purple, and yellow. A 5-hero mono team has the lowest chance, of course: 4% purple and a 24% chance each for red, green, yellow, and blue.

Weirdly, after I lose and REMATCH, it seems the mono color chance sometimes goes up and I MURDER the opponent, lol. So CHANCE is still the most important thing. No cheating as far as I can see, but the % does get lower the more you choose the same color. Hope this helps. Good luck :+1:

When you say “noticed”, do you mean that you have actually collected a reasonably large amount of data?

I ask because the effect you’re describing certainly isn’t present in the data I’ve collected and published above. So I’d be very interested to see data representing what you’re describing.

4 Likes

Oh, not a large amount by myself. Some group members and I did a 20-battle test after I first started playing, because I was confused about why the tiles for a color were lower than the rest when I didn’t have a hero of that color. So yes, we did collect personal data for alliance conversation, but nothing like your full-on data. The numbers might well have changed over a LARGER scale, so I recommend your data. I will put in big words that it’s just my own knowledge so I don’t confuse people. Thanks for bringing that to my attention :+1: But I will do a large-scale test on this and see if it still adds up. Do you think 100 matches is good? I will start there.

So, results for 201 attacks.
Zero - 0
One - 0
Two - 2
Three - 8
Four - 19
Five - 25
Six - 32
Seven - 44
Eight - 26
Nine - 19
Ten - 12
Eleven - 8
Twelve - 4
Thirteen - 2
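The tallies above can be reduced to a single strong-tile rate. This is a minimal sketch that assumes every starting board has 35 tiles, as in the raid tables earlier in the thread.

```python
TILES_PER_BOARD = 35  # assumed board size, taken from the earlier raid tables

# strong tiles on the starting board -> number of boards with that count
tally = {0: 0, 1: 0, 2: 2, 3: 8, 4: 19, 5: 25, 6: 32, 7: 44,
         8: 26, 9: 19, 10: 12, 11: 8, 12: 4, 13: 2}

boards = sum(tally.values())
strong_tiles = sum(tiles * count for tiles, count in tally.items())
strong_rate = strong_tiles / (boards * TILES_PER_BOARD)
mean_per_board = strong_tiles / boards

print(f"{boards} boards, {strong_tiles} strong tiles")
print(f"strong-tile rate {strong_rate:.2%}, mean {mean_per_board:.2f} per board")
```

That works out to about 19.8% strong tiles, or roughly 6.9 per board, close to the 20% and 7 tiles a fair draw would give.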

It looks more or less good now, so I will stop collecting this data.

HOWEVER, I will start collecting data for mono colors in wars! I recently started using mono color there. In the last 3 wars I had one 7, one 6, and the rest 2-4. Looks very suspicious, even for a small sample.

3 Likes
