Slot Statistics - Critiques Requested

I like to think of Variance in terms of how big a bankroll I will need to get close to the theoretical RTP. For games where a very large part of the payback comes from bonus rounds, if the bonus round doesn't usually trigger within 140 spins or so, that means it is pretty high variance.

For games which produce decent line wins, you might play a few hundred spins and still be close to even, without a bonus round.

Let's take a page from one of Enzo's past posts, and use a slot machine with every spin costing $1 and paying back 99 cents. That would be the ultimate in low variance, but it guarantees every session will be a losing session.

At the other end of the spectrum is a slot that costs $1 to play and pays a jackpot of $99,000 on average once every 100,000 spins, with no other wins possible. But because of its random nature, you might hit that jackpot on the very first spin, and you might even hit it again on the very next spin. Or you could spin 200K times and never hit it.

If it was $1 a spin and hit on average once every 100 times and paid $99, that would be very medium variance IMO, but another player might consider it low variance. If I usually only play with $10, I might feel it's high; if I have a $1,000 bankroll, I'm sure I would consider it low.

All the above examples produce payback of 99%
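To put numbers on the three examples above, here is a small sketch that computes the return and the per-spin standard deviation of each game from its paytable. The function name and the (payout, probability) representation are illustrative choices, not anything taken from a real slot.

```python
import math

def rtp_and_sd(paytable, bet=1.0):
    """Return (RTP, SD per unit bet) for a list of (payout, probability) pairs.

    Any probability mass not listed is treated as a spin that pays nothing.
    """
    mean = sum((pay / bet) * p for pay, p in paytable)
    mean_sq = sum((pay / bet) ** 2 * p for pay, p in paytable)
    variance = mean_sq - mean ** 2
    return mean, math.sqrt(max(variance, 0.0))

games = {
    "always pays $0.99": [(0.99, 1.0)],
    "$99,000 once per 100,000 spins": [(99_000, 1 / 100_000)],
    "$99 once per 100 spins": [(99, 1 / 100)],
}

for name, table in games.items():
    rtp, sd = rtp_and_sd(table)
    print(f"{name}: RTP = {rtp:.2%}, SD = {sd:.2f} per $1 bet")
```

All three return 99%, but the standard deviation per $1 bet runs from 0 for the fixed-payback game, to about 9.85 for the $99 game, to about 313 for the jackpot-only game.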

Of course few games play quite that way; most offer intermediate wins that keep you playing. The more small wins you get to increase the number of spins you have before hitting that jackpot or bonus round, the lower the variance as I see it.

Oh, and by the way, I don't have that big a problem with "winning" 60 cents when I bet $1. I spent the $1 to take a chance to win, and winning 60 cents is better than nothing.
 
Jasminebed,

Thank you VERY MUCH for taking the time to provide all of those words.

I've discovered that slot variance is such a subjective thing - player to player, game to game (day to day?). And 10 years (20 years?) of history has led to the creation of a whole lot of words to try to describe this subjective thing.

But all slots have a variance value that is a number - a calculated, hard, nothing-subjective-about-it number. It would appear that Galewind Software is the first in line to actually publish those numbers.

So we find ourselves in the difficult position of publishing hard numbers into a well-developed "subjective words dictionary" world. I've published a variance number of 35. What does that mean? It's not in the dictionary.

I've decided, for the time being anyway, that it doesn't mean anything at all. Hmm. Let me re-phrase that. It means whatever anyone wants it to mean.

Again, thank you for all of your words.

I assume that your final paragraph was in reference to my "rant" on "WIN!" displays. Well, it's a pet peeve of mine - we've all got 'em. Each to his own I guess.

Chris
 
Even if it is subjective whether a player views a slot's variance as low medium or high, the numbers will provide a guideline to allow a player to know if a game (especially an unfamiliar one) is higher or lower than others.

Jasminebed,

I agree. I plan to publish this page (the one that has been linked in this thread) shortly, so the numbers will be publicly available (to that small percentage of Players that use the Help system anyway).

What I have decided not to do, for now, is to assign any words to those numbers. That is, Crazy 8 Line has a variance of 23, Gems of Isis is 51, and Silver Spurs is 106. I'm not going to jump the gun and assign words to those numbers until I feel more comfortable that any word association I make is consistent with something/anything.

But, to repeat, the numbers are going to be published and available.

Chris
 
ChopleyIOM,

BTW, the changes that you suggested concerning our choice of words regarding the wild symbol not appearing on reel 1 during Free Spins (remove the "There are no multiple wild stops in a free spin" part) is up on both the popup paytable graphics as well as in the Help file text.

Again, thanks for that. You were right - it did sound harsh.

Chris
 
I noticed I was mentioned in this thread, so I thought I'd pop in and add my two cents' worth.

First of all, I think it's great that Chris did all this work to increase the transparency of their slots' performance to players. In his previous thread about return percentages, a few people wanted slot variance data to be published, and it shows a great deal of devotion to players that he then spent all the time and money to deliver this data. I haven't seen a software provider cater to players' wishes in this way before, so kudos for that!

I am a bit disappointed though, because I have read all those threads and long discussions by several members on this forum wishing that casinos increase their transparency and now that something like that is finally done in this thread, most of these posters haven't even bothered to come here to reply (VWM where are you lurking?)

Like Chris mentioned, based on our discussions he added a couple of new details to the slot performance help-page. Now would be also the perfect opportunity for the people here to suggest an interesting slot stat they would like to see published. So is there some other statistic you would like to see on the help-page that hasn't been mentioned yet? Perhaps Chris will be open to adding it to the page as well.
 
I'll cover some issues that have been discussed in the thread.

To the topic of: What is the real meaning of standard deviation / variance value and can you assign some descriptive words to it?

As was mentioned, it is difficult to compare variances of slots and assign words to these values because they haven't been published for other software providers (I'll state the variance values of two Microgaming slots below, though). But for most other game types standard deviations are known, and you could compare the standard deviations of slots to these games. For example, Jacks or Better video poker has a standard deviation of 4.42, so if you play a slot with a standard deviation close to 4.4, in a way you'd expect the same amount of swings as playing Jacks or Better at the same bet size.

It's important to note here that many quite different kinds of games can have similar standard deviation value, so the standard deviation or variance value by itself may not accurately describe the "feel of the game". For example the following bets:

1) Playing single-zero roulette by betting $5 on single number every spin
2) Playing Joker Poker video poker betting $5 per hand
3) Playing Galewind's Lucky Lantern slot at $5 per spin

All of the above bets have a standard deviation value close to 6, ie. the standard deviation of the above bets is ~$30 per round. But the games are quite different and the "feel" is also quite different between them. Yet, in a somewhat abstract sense, they all deliver the same amount of "swings" as the variance between them is nearly the same. The point is that it would be overly optimistic to think that the "whole feel" of the game could be captured in a single value called variance.

Now the topic of comparing Galewind's slot variance data to other softwares.

Unfortunately there's not much info available on the subject (as other sw providers haven't published these). However, I have been working on a slot simulator tool which enables me to measure variance values for any slot that I have obtained the reel layouts for. So far I have done this for only two Microgaming slots but I guess this is better than nothing. The results are:

Loaded slot [2x multiplier choice in free spins, maximum 25 lines]: Standard deviation based on 10 million simulated spins = 5.0
Thunderstruck Slot [maximum 9 lines]: Standard deviation based on 10 million simulated spins = 6.5

So now we have some reference values for two MG slots. People are more familiar with these slots than with Galewind's slots, so do you consider these two slots low, medium or high variance? One reference: KasinoKing's website describes 'Loaded slot' as Medium variance and 'Thunderstruck slot' as Low variance. There is an inconsistency here in KK's assignments, as Thunderstruck has a larger standard deviation than Loaded ;) Anyway, if the general consensus is that, for example, both of these slots are Medium variance, then we would at least have a starting point for assigning descriptions to standard deviation values, such as: "slots with standard deviation between 5-7 fall into the Medium variance category."

More to come later...
 
Something else about which I have been thinking since I first started to organize all of this data:

- A 20-line bet, $1 per line, one line wins and pays $5.

- A 20-line bet, $1 per line, five lines win, each paying $1 for a total return of $5.

I'm pretty sure (I rechecked the Variance equation in my first post) that there is no difference in Variance here. Or rather, if there were a sufficient number of these types of conditions within the sample set, there would be no impact on the resulting Variance calculation/result.
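A quick way to check this is to note that a per-spin variance calculation only ever sees the total paid on each spin. The sketch below assumes that is how the equation from the first post works (it is not reproduced here), and the two ten-spin samples are made up purely for illustration.

```python
from statistics import pvariance

def per_unit_variance(total_payouts, total_bet):
    """Population variance of (total payout / total bet) over a set of spins."""
    return pvariance([pay / total_bet for pay in total_payouts])

TOTAL_BET = 20  # 20 lines at $1 per line

# Hypothetical samples of 10 spins each, written as the list of line wins per spin.
# Sample A: every winning spin is a single line paying $5.
# Sample B: every winning spin is five lines paying $1 each.
spins_a = [[5], [], [], [5], [], [], [], [5], [], []]
spins_b = [[1] * 5, [], [], [1] * 5, [], [], [], [1] * 5, [], []]

totals_a = [sum(line_wins) for line_wins in spins_a]
totals_b = [sum(line_wins) for line_wins in spins_b]

print(per_unit_variance(totals_a, TOTAL_BET))
print(per_unit_variance(totals_b, TOTAL_BET))
```

Both samples collapse to the same list of per-spin totals, so the two printed values are identical. Any difference the Player feels comes from how the win is presented, not from the number.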

However, I think that there would be a big difference in how the Player perceives the Variance. I think that the single-line win would be perceived as a higher Variance, and the multi-line win would be perceived as a lower Variance.

Obviously, a sample of 1 would have little if any impact. But if, purely by coincidence, a number of these samples occurred, especially if they occurred within a short span of time, then I think that would impact the Player's opinion.

"You know, the first time I played this slot I thought that it was Low Variance, but I just played it again and now I'm thinking that maybe it is more Medium Variance."

Here is the list, sorted by the Standard Deviation lowest to highest:

Take It or Leave It: 4.1

Hot Peppers: 4.7

Crazy 8 Line: 4.8

Diamonds Wild: 4.8

Lucky Lanterns: 6.1

Gems Of Isis: 7.1

Golden Goal: 7.4

Bowl-a-Palooza: 7.7

The Red White & Blue: 8.0

Silver Spurs: 10.3

Bells & Whistles: 16.2


Chris
 
Unfortunately there's not much info available on the subject (as other sw providers haven't published these). However, I have been working on a slot simulator tool which enables me to measure variance values for any slot that I have obtained the reel layouts for.

So far I have done this for only two Microgaming slots but I guess this is better than nothing. The results are:

Loaded Slot: Standard Deviation (SD) = 5.0, Variance = 25.

Thunderstruck Slot: SD = 6.5, Variance = 42.

So now we have some reference values about two MG slots. People are more familiar with these slots than Galewind's slots, so do you consider these two slots as low, medium or high variance?

KasinoKing describes Loaded (Variance = 25) as Medium Variance and Thunderstruck (Variance = 42) as Low Variance.

There is inconsistency here in KK's assignments, as Thunderstruck has larger standard deviation than Loaded. ;)

Anyway, if the general consensus is that, for example, both of these slots are Medium variance, then we would at least have a starting point for assigning descriptions to standard deviation values, such as: "slots with standard deviation between 5-7 fall into the Medium variance category."

More to come later...

I have reformatted your post so that I might better understand its significant points.

What is your level of confidence in the accuracy of the reel layouts for the 2 MG slots which you reference?

@KK - What criteria did you apply which resulted in your assigning a Low Variance to Thunderstruck and a Medium Variance to Loaded? I confess that I am unfamiliar with the site which Jufo has linked, but I will head over there shortly.

Chris
 
What is your level of confidence in the accuracy of the reel layouts for the 2 MG slots which you reference?

I'd say my confidence is moderately high. The reel layouts for different Microgaming slots were originally "cracked" in this thread dating back to 2007. These reels also produce the exact same RTP that the software provider has published.

@KK - What criteria did you apply which resulted in your assigning a Low Variance to Thunderstruck and a Medium Variance to Loaded? I confess that I am unfamiliar with the site which Jufo has linked, but I will head over there shortly.
Chris

What makes Thunderstruck to have higher variance is the restriction to maximum 9 lines. This means fewer line hits that are larger in magnitude. I think the "default" approach in ranking slots according to their variance would be assuming that maximum number of lines are always played as most players prefer to play maximum lines. However, if you only played 9 lines in Loaded (rather than the maximum 25), Loaded seems to have about the same standard deviation value (~6.5) as Thunderstruck with 9 lines.
 
I was confused by the last paragraph, so here is more data:

Game: SD: Line Count

Take It or Leave It: 4.1: 27

Hot Peppers: 4.7: 20

Crazy 8 Line: 4.8: 8

Diamonds Wild: 4.8: 1

Lucky Lanterns: 6.1: 20

Gems Of Isis: 7.1: 20

Golden Goal: 7.4: 20

Bowl-a-Palooza: 7.7: 9

The Red White & Blue: 8.0: 1

Silver Spurs: 10.3: 9

Bells & Whistles: 16.2: 5

I don't see any correlation between line count and SD.
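One way to check that impression by the numbers is a Pearson correlation over the eleven games listed above. This is only a rough sketch: eleven points is a very small sample, and it compares different games at their maximum line counts, which is a different question from how a single game's SD changes when you play fewer of its lines.

```python
import math

# (game, SD, line count) exactly as listed above
games = [
    ("Take It or Leave It", 4.1, 27),
    ("Hot Peppers", 4.7, 20),
    ("Crazy 8 Line", 4.8, 8),
    ("Diamonds Wild", 4.8, 1),
    ("Lucky Lanterns", 6.1, 20),
    ("Gems Of Isis", 7.1, 20),
    ("Golden Goal", 7.4, 20),
    ("Bowl-a-Palooza", 7.7, 9),
    ("The Red White & Blue", 8.0, 1),
    ("Silver Spurs", 10.3, 9),
    ("Bells & Whistles", 16.2, 5),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

sds = [sd for _, sd, _ in games]
lines = [count for _, _, count in games]
print(f"correlation between line count and SD: {pearson(lines, sds):.2f}")
```

The coefficient comes out negative but modest, so across this particular collection line count explains only a small part of the differences in SD.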
 
The above list seems to assume maximum number of possible lines played in every slot. The last paragraph in my previous post addressed how the SD value of the slot increases when you play fewer than maximum number of lines. The correlation between # of lines and standard deviation was actually going to be my next topic. I'll get back to it tomorrow.
 
What I was getting at in my previous posts was the relationship between a slot's SD value and the number of lines played. Chris, you might think this is adding confusion, but IMO this is relevant and needs to be addressed. I hope discussing it clears up the confusion you had from my previous posts.

In the helpfile, Galewind's Hot Pepper slot is listed to have SD of 4.7 with maximum 20 lines played. Below are the SD values of this slot with different number of lines played, obtained from the slot simulator by simulating 20 million spins for each choice of # of lines. (I hope Chris that it is ok to post these):

Hot Pepper Slot - 15 lines = ~4.9
Hot Pepper Slot - 10 lines = ~5.3
Hot Pepper Slot - 5 lines = ~6.6
Hot Pepper Slot - 3 lines = ~8.7

'~' means "approximately". I had to use this notation because even in 20 million spins the estimated standard deviation values have decent margin of error.

Chris, you can verify the above numbers with the copy of the sim I sent you. (Shame on you if you haven't tried it already!)

What the above results show is that if you play Hot Pepper slot with only 3 lines it changes from lower-variance slot to much higher-variance slot. In fact you would have larger swings than by playing 20 lines in Gems of Isis (SD 7.1). This is of course assuming you keep the total bet size constant.
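The direction of this effect can be illustrated with a toy model. The sketch below assumes the total bet is split evenly over the lines, that every line has the same SD per unit of line bet, and that any two lines share the same correlation. Those are simplifying assumptions, and the two constants are hypothetical values picked for illustration, not Hot Pepper's actual figures or a fit to the simulated results.

```python
import math

def spin_sd(n_lines, line_sd, rho):
    """SD per unit of total bet when the bet is split evenly over n_lines.

    line_sd: SD of one line's payout per unit of line bet (hypothetical).
    rho:     correlation between the payouts of any two lines (hypothetical).
    Derived from Var(mean of n equally correlated variables)
        = line_sd**2 * (1/n + (1 - 1/n) * rho).
    """
    return line_sd * math.sqrt(1 / n_lines + (1 - 1 / n_lines) * rho)

LINE_SD = 14.0  # hypothetical
RHO = 0.08      # hypothetical

for n in (20, 15, 10, 5, 3):
    print(f"{n:2d} lines: SD per unit bet = {spin_sd(n, LINE_SD, RHO):.1f}")
```

With the total bet held constant, fewer lines means less averaging across lines, so the SD per unit bet rises as the line count falls.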

So what I was getting at in my previous posts was that if you ranked the slots by choosing, say, 9 paylines for every slot (excluding those slots with fewer than 9 lines), then the order of the slots according to their variance might be completely different from what you have right now. Therefore your choice for the number of lines affects the resulting ranking. I wrote before that the "default" (most sensible) choice would be to always use max lines when comparing variances, but it's not the only possible choice.

I wrote before that Microgaming's Loaded slot has SD of 5.0 with max. 25 lines but SD of 6.5 with 9 lines. Thunderstruck has SD of 6.5 with 9 lines (=max.lines). I was just trying to see if the number of chosen lines explains KK's assignment of Thunderstruck being lower-variance than Loaded but the numbers didn't support this, so KK's assignment seems to be simply wrong.

This brings us to the topic of how players can control their variance

Even though players cannot influence how the slot operates and have to take its behaviour as given, they do have plenty of control over the variance, which is the topic of this thread. They have two parameters with which to control the variance: bet per line and number of lines. And these work differently.

Standard deviation is linear in total bet size, so to have linear control over the size of the swings, players can simply lower or increase their bet size.

For example, playing Gems of Isis (20 lines) at $1 per spin gives a standard deviation of $7.1 per spin. Doubling the bet size to $2 per spin also doubles the standard deviation, to $14.2 per spin. Notice, however, that doubling the bet size quadruples (not doubles) the variance.
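The scaling can be written out in a few lines. This sketch uses the published Gems of Isis figure of 7.1 per unit bet; the function names are just illustrative.

```python
UNIT_SD = 7.1  # Gems of Isis, 20 lines, SD per $1 of total bet

def sd_in_dollars(total_bet, unit_sd=UNIT_SD):
    """Standard deviation of one spin, in dollars: scales linearly with the bet."""
    return total_bet * unit_sd

def variance_in_dollars_squared(total_bet, unit_sd=UNIT_SD):
    """Variance of one spin, in dollars squared: scales with the square of the bet."""
    return sd_in_dollars(total_bet, unit_sd) ** 2

for bet in (1, 2, 4):
    print(f"${bet} per spin: SD = ${sd_in_dollars(bet):.1f}, "
          f"variance = {variance_in_dollars_squared(bet):.2f}")
```

Each doubling of the bet doubles the SD column and multiplies the variance column by four.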

However, players may seek higher variance without being comfortable increasing their total bet per spin. They can accomplish this by switching from a lower variance slot to a higher variance slot, but they could also simply reduce the number of lines while keeping the total bet size constant. As shown above, even a lower variance slot like Hot Pepper becomes a rather high variance slot when you reduce the number of lines from 20 to 3. This means that despite the slot's behaviour being "cast in stone", there's plenty of room for players to obtain a desired level of variance.

However, I often hear that players are not very willing to play fewer than maximum number of lines. This seems to relate to the feeling of frustration if there happens to be a big payout on a payline that wasn't chosen. But at the same time they don't consider that playing fewer than maximum lines also saves them money every time nothing hits on those paylines, so it all evens out.

I threw this out here because it would be interesting to hear opinions from players why they like/don't like to play fewer than maximum lines.
 
Well, the first thing I'm going to have to do is to add the qualifier to the Slot Stats file that all Standard Deviation and Variance values are based on the use of max lines wagered, min bet per line wagered.

Also, it is completely OK for you to publish any numbers that you wish concerning the Hot Peppers slot.

And yes, I hang my head in shame that I have not yet tried your sim. I offer no excuses.

Now, as to my statement that you will create more confusion than clarification:

Consider that the objective in all of this is to create for the Player some way to compare the Variance of all of the slots in a collection.

That is, a "recreational" Player who wants to maximize their spin time but is on a limited budget is going to want to play a slot with a lower variance. Someone with a larger stake who is looking for the "big score" and can afford the potential spin time needed to get it is going to be looking for a higher Variance slot.

Since both of your modifications (line bet count and line bet amount) increase the slot's Variance from its Theoretical Minimum, then one can say that the most effective way to allow a Player to compare one slot to another is to publish the minimum variance values, which will be realized with minimum bet amount and maximum line count.

I Started a Thread on this, and (perhaps by coincidence) this was the default - on load - configuration which won the vote.


It just occurred to me - perhaps I should refer to my currently published SD and Variance values as:

Theoretical Minimum Standard Deviation

Theoretical Minimum Variance.


Your data (modify the line count, modify the line bet) demonstrates how you can change an orange into an apple.

1. I agree it is relevant.

2. I agree that it is interesting, especially given the fact that with your sim, and absolute confidence in at least the Hot Peppers reel layout, you can play around with all sorts of different configurations and quickly see the quantitative impact.

3. If you think that this hasn't added a layer of confusion for the "recreational slot Player" then you, my friend, have been spending way too much time with that sim.


Why I agree that it is relevant. There were a couple of statements earlier in this thread by zap987 concerning the Gems of Isis slot:

Depends on the reel layout but the max possible win if you hit the top symbol is going to be about 700-1000xbet and that is the best you can ever hope for. Of course 1000x bet is a very good win and variance isn't only in how big wins you can get but also how often they happen and having played a lot of spins in fun mode it just feels like low to medium variance compared to real high variance slots. Just too much of the RTP in small hits and not enough in the big wins.

Sadly that does mean that it's another low variance slot ...

Both of these seem to indicate that zap987 is looking for a high-variance slot, and the default configuration of Gems is not meeting that requirement. Your numbers clearly demonstrate that he can achieve that increased variance in Gems by doing either of the modifications you've suggested above.

BTW, the one time that I suggested (even just as a question, rather than a statement or a directive) dropping the line count, I got Thoroughly Spanked by VWM.

Chris
 
I don't have much to add except a couple of short comments.

Consider that the objective in all of this is to create for the Player some way to compare the Variance of all of the slots in a collection.

That is, a "recreational" Player who wants to maximize their spin time but is on a limited budget is going to want to play a slot with a lower variance. Someone with a larger stake who is looking for the "big score" and can afford the potential spin time needed to get it is going to be looking for a higher Variance slot.

Since both of your modifications (line bet count and line bet amount) increase the slot's Variance from its Theoretical Minimum, then one can say that the most effective way to allow a Player to compare one slot to another is to publish the minimum variance values, which will be realized with minimum bet amount and maximum line count.

I Started a Thread on this, and (perhaps by coincidence) this was the default - on load - configuration which won the vote.

Yes I agree, Max lines (or Minimum Variance like you put it) seems to be the best way to rank slots according to their variance. Mostly because this is the setting that the vast majority of players are using anyway.

It just occurred to me - perhaps I should refer to my currently published SD and Variance values as:

Theoretical Minimum Standard Deviation

Theoretical Minimum Variance.

I don't think that's necessary. Many people wouldn't see the connection to max lines and would be confused what the above words refer to.

3. If you think that this hasn't added a layer of confusion for the "recreational slot Player" then you, my friend, have been spending way too much time with that sim.

Heh maybe my last bit did add confusion. But since I finally came here to write to this thread I thought that I will try to bring some new concepts to the table, even with the risk that the discussion is going to get a bit more involved. I wonder if anyone besides me and you is even reading this thread anymore :o

Both of these seem to indicate that zap987 is looking for a high-variance slot, and the default configuration of Gems is not meeting that requirement. Your numbers clearly demonstrate that he can achieve that increased variance in Gems by doing either of the modifications you've suggested above.

BTW, the one time that I suggested (even just as a question, rather than a statement or a directive) dropping the line count, I got Thoroughly Spanked by VWM.
Chris

Actually, I forgot to add to my last post that there is one crucial difference between increasing variance by increasing bet size and increasing it by dropping line count. If you increase the variance by doubling your bet size, the expected loss per spin will double as well. At $1 per spin, each spin costs you 3 cents on average, but at $2 per spin each spin costs you 6 cents on average. However, by dropping line count you can keep the bet size and expected loss per spin constant and achieve increased variance for free. Therefore it's more "cost-efficient" to do it by dropping lines than by increasing bet size. This observation relates to the same concept mentioned earlier in this thread, that a gambler should sometimes seek to minimize "house edge/variance" rather than house edge only. However, people value many different things when playing, such as the length of game play, and mathematically the most "effective" way of playing (that seeks to minimize long-term losses) is usually not the most entertaining.
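The comparison can be laid out as a small calculation. The sketch below borrows the Hot Pepper SD figures quoted earlier in the thread and assumes a 3% house edge to match the "3 cents per $1 spin" example; whether that edge applies to Hot Pepper specifically is an assumption made only for illustration.

```python
HOUSE_EDGE = 0.03               # assumed, from the "3 cents per $1 spin" example
UNIT_SD = {20: 4.7, 3: 8.7}     # Hot Pepper SD per unit bet (3-line value is approximate)

def per_spin(total_bet, lines):
    """Return (expected loss, SD), both in dollars, for one spin."""
    expected_loss = total_bet * HOUSE_EDGE
    sd = total_bet * UNIT_SD[lines]
    return expected_loss, sd

options = {
    "baseline: $1 on 20 lines": (1, 20),
    "double the bet: $2 on 20 lines": (2, 20),
    "drop lines: $1 on 3 lines": (1, 3),
}

for name, (bet, lines) in options.items():
    loss, sd = per_spin(bet, lines)
    print(f"{name}: expected loss ${loss:.2f}, SD ${sd:.2f}, loss per $ of SD {loss / sd:.4f}")
```

Doubling the bet and dropping to 3 lines land at a similar SD per spin, but only the first doubles the expected loss.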
 
OK, I added the line bet/lines bet disclaimer to the top of the file.

You may be right - we're the only ones in this conference room. Well, screw it - let's get a keg in here.

In addition to seeing it in the equations, I can also intuitively understand the increase in Variance associated with a decrease in paylines wagered.

However, I can't see the same thing with the bet amount. The SD applies to the RTP. RTP is a ratio. That is, bet 1000 and lose 100 is the same as bet 10 and lose 1. So, I can't intuitively understand how increasing the bet per line increases the SD.

I can see that it increases the absolute value of win/loss. Again, the example above - lose 100, or lose 1.

However, since SD (and thus Variance as SD squared) is relative to RTP, and RTP is not affected by bet size (as above) - explain to me what I'm missing here.

Chris

Edit: Reminder to self. Do the HE/Variance calculations to determine Galewind's "best" slot.
 
Hmm there seems to be some severe misunderstanding of concepts here as I don't understand too much of your post. Well, since I am a teacher let's try to go forward...

Referring to your post above, house edge and variance/SD are independent parameters. They don't relate to each other. So I don't understand or agree with the statement "The SD applies to the RTP".

For example, the SD value of 4.7 for Hot Pepper slot (with 20 lines) means the standard deviation relative to a unit wager, in other words a normalized value. If you bet $1, the SD is $4.7 per spin (notice that the value has a unit of dollars; it's not a unitless parameter). If you bet $2 a spin, the SD is obviously $2*4.7 = $9.4, and so on. The unit SD value never changes, i.e. it is always 4.7, but you multiply this 4.7 by your bet size ($1, $2 or whatever). This means that if you double your bet, the standard deviation of the bet in dollars ($4.7 or $9.4) doubles, while the "SD per unit bet" of 4.7 always remains constant.

In contrast, when decreasing the number of lines, the "SD per unit bet" 4.7 is not a constant anymore, but varies according to number of lines. So these two settings (total bet amount and number of lines) affect the actual SD of the bet in two different ways.

Some examples of different bets in Hot Pepper slot:

#1
20 lines, $5 bet per spin ($0.25 per line):

SD = $5*4.7 = $23.5 per spin

#2
5 lines, $3 bet per spin ($0.60 per line):

SD = $3*6.6 = $19.8 per spin

#3
15 lines, $15 bet per spin ($1 per line):

SD = $15*4.9 = $73.5 per spin

It's these SD values that the player will observe (expressed in dollars), not the "SD per unit"-values in the help-file.
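The three examples above boil down to a lookup followed by a multiplication. In this sketch the table holds the Hot Pepper figures quoted in this thread (the values for fewer than 20 lines are the approximate simulated ones), and the function name is just illustrative.

```python
# Hot Pepper SD per unit of total bet, keyed by number of lines played
HOT_PEPPER_UNIT_SD = {20: 4.7, 15: 4.9, 10: 5.3, 5: 6.6, 3: 8.7}

def spin_sd_dollars(lines, bet_per_line):
    """Return (total bet, SD in dollars) for one spin at the given settings."""
    total_bet = lines * bet_per_line
    return total_bet, total_bet * HOT_PEPPER_UNIT_SD[lines]

for lines, bet_per_line in [(20, 0.25), (5, 0.60), (15, 1.00)]:
    total_bet, sd = spin_sd_dollars(lines, bet_per_line)
    print(f"{lines} lines at ${bet_per_line:.2f} per line: "
          f"total bet ${total_bet:.2f}, SD ${sd:.2f} per spin")
```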

Now I am not sure if this was exactly what you were trying to address in your post, but if it was, I hope this clears it up.
 
You may be right - we're the only ones in this conference room. Well, screw it - let's get a keg in here.

I'm still reading, although the maths is starting to go over my head a bit :confused: :D
 
Jufo,

For ease of access, Link Removed ( Old/Invalid) to the page in question.

Note my Definition of Terms for Standard Deviation. What you have just described completely contradicts everything in here. I'm OK with the contradiction. My goal is an accurate publication.

(Technically, we did not calculate the SD. We calculated the Variance (using the equation in my first post), and then just took the square root of that for the SD.)

Assuming SD as it applies to a normal distribution curve, what are your X and Y axis labels?

Given your example #1 below:

20 lines, $5 bet per spin ($0.25 per line): SD = $5*4.7 = $23.5 per spin. Given that 1 SD is "66% of the area under a normal curve", are you saying that 66% of the time your win on this spin will fall between what? 0 and $23.5?

Chris
 
Ok I am trying to see what you mean. No, what I described doesn't really contradict what's in the page, it's more like what I described is not mentioned in the page.

What it says at the top:

"Please Note: The Standard Deviation and Variance values reported are based on using the minimum bet amount per payline, and the maximum payline count per slot."

This line should say something like:

"Please Note: The Standard Deviation and Variance values reported are based on using unit bet size, and the maximum payline count per slot."

Now if you choose 20 lines at 0.05 coin (minimum coin) each spin happens to be at unit bet size, so your definition is accidentally correct but this is not what the published SD values really mean.

Note, that in the equation to calculate variance in the first post you are dividing the sum of payouts by total bet size, so in other words you are normalizing variance relative to unit bet size in that calculation. The equation in the first post is precisely the same one as the one my simulator uses for calculating the variance (and SD).
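Written out, the normalization looks something like the following. The equation from the first post is not reproduced in this part of the thread, so the form and the symbols here are an assumed, generic version: \(w_i\) is the total payout on spin \(i\), \(b\) is the total bet per spin, and \(N\) is the number of spins.

```latex
\[
\bar{r} = \frac{1}{N}\sum_{i=1}^{N}\frac{w_i}{b},
\qquad
\mathrm{Var}_{\text{unit}} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{w_i}{b} - \bar{r}\right)^{2},
\qquad
\mathrm{SD}_{\text{unit}} = \sqrt{\mathrm{Var}_{\text{unit}}}
\]
\[
\mathrm{SD}_{\$}(b) = b \cdot \mathrm{SD}_{\text{unit}}
\]
```

If every payout and the bet are both multiplied by the same factor, each ratio \(w_i / b\) is unchanged, so the unit variance and unit SD stay the same. That is the sense in which the published values are "per unit bet", and only the dollar SD grows with the bet.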

A bit further down it reads:

"The Standard Deviation values are associated with the Theoretical RTP values. They are an indicator of how far above or below these Theoretical RTP values your results might be given a reasonably large sample size."

This is correct in the sense that you can use the combination of both T-RTP and SD to measure how far the actual results will be from T-RTP in a sample. But SD and T-RTP are still independent parameters even though you can use them together to perform such a calculation.
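That calculation is usually done with the standard error of the mean: over N spins the observed RTP has a standard deviation of roughly the unit SD divided by the square root of N. The sketch below uses the Hot Pepper SD of 4.7 and a hypothetical theoretical RTP of 97%, and it leans on the normal approximation, so it is only meaningful when N is large.

```python
import math

def rtp_band(theoretical_rtp, unit_sd, n_spins, z=1.0):
    """Approximate band for the observed RTP after n_spins.

    Uses the normal approximation: half-width = z * unit_sd / sqrt(n_spins).
    z = 1 gives the one-SD band, z = 2 the two-SD band.
    """
    half_width = z * unit_sd / math.sqrt(n_spins)
    return theoretical_rtp - half_width, theoretical_rtp + half_width

T_RTP = 0.97    # hypothetical theoretical RTP
UNIT_SD = 4.7   # Hot Pepper, 20 lines

for n in (100, 10_000, 1_000_000):
    low, high = rtp_band(T_RTP, UNIT_SD, n)
    print(f"{n:>9,} spins: one-SD band for observed RTP = {low:.1%} to {high:.1%}")
```

The band narrows with the square root of the number of spins, so a hundredfold increase in spins only tightens it tenfold.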

I think it might be a good idea to alert Jacobsson and/or the Wiz to this thread, because they might be able to clear the confusion. They might have a different interpretation of these definitions than what I have presented.

Assuming SD as it applies to a normal distribution curve, what are your X and Y axis labels?

Given your example #1 below:

20 lines, $5 bet per spin ($0.25 per line): SD = $5*4.7 = $23.5 per spin. Given that 1 SD is "66% of the area under a normal curve", are you saying that 66% of the time your win on this spin will fall between what? 0 and $23.5?

Chris

Using a normal distribution curve requires that the result is normally distributed in the first place. The result of one spin is definitely not normally distributed, so you can't apply "66% of the area under a normal curve" to one spin; you can only apply it to a sample of a relatively large number of spins.

It's funny because this was exactly the topic that I was going to address next, i.e. to answer the following questions: how do you use the known SD values to determine the range of results for a sample of N spins, and how large a sample (variable N) is needed for the resulting distribution to be close enough to a normal distribution that such a determination is mathematically valid? I warn you, though, that the number of readers of this thread might drop to negative numbers.
 
Jufo,

You are having fun with that sim, aren't you?

Actually, it was Eliot that wrote the section on Standard Deviation. It was also Eliot that suggested the addition of the chart in the Line Pay Hit Frequency section. (Your suggestions resulted in the addition of 2 more columns to this chart.)

If you could explain to me what "Unit Bet Size" means, then maybe I can make a modification to that introductory sentence that doesn't use that phrase, so that the average reader can understand.

It is my understanding that SD has to be associated with something. Example:

Measure the height of 10,000 men, then plot the results on a graph where the x axis is count and the y axis is height. (Assume a Standard Distribution, which is a reasonable assumption.) The peak of the curve is the mean, and 1 SD either way means that 33% of the sample is less than the mean and 33% of the sample is greater than the mean. (Total of 66% of the area under the curve.)

So, in this case SD is associated with the distribution of Height values.

If I publish SD values, what is the SD associated with? SD is associated with a distribution of values - to what distribution of values is this SD associated?

Re: your last paragraph. You're right - if ever there was a thread killer then that sounds like it. Personally, I'm going to vote that the required number will be smaller than expected. For Hot Peppers - 6,478. (Of course, that depends on the definition of "close enough to normal distribution".)

Chris
 
Jufo,

You are having fun with that sim, aren't you?

I don't follow you. In my previous post I clarified the confusion and severe lack of understanding you have with basic concepts, and explaining these concepts to you has absolutely nothing to do with the sim. So I really don't get or appreciate the above remark.

Actually, it was Eliot that wrote the section on Standard Deviation. It was also Eliot that suggested the addition of the chart in the Line Pay Hit Frequency section. (Your suggestions resulted in the addition of 2 more columns to this chart.)

There is nothing wrong with that section if you are trying to explain a complicated subject in two sentences. But it doesn't mean that all there is to be said about variance is there. You should still alert Eliot to this thread if you want to have assurance that what I have written is accurate, because you certainly seem to question it at every turn.

If you could explain to me what "Unit Bet Size" means, then maybe I can make a modification to that introductory sentence that doesn't use that phrase, so that the average reader can understand.

Sigh, what else can I say? "Unit bet size" is a bet of one unit. If I go to foreign exchange and change 100 euros to 150 dollars then the unit conversion rate is 1.5. It's not 150 even though I got 150 dollars and not 1.5 dollars. Are you being thick here on purpose?

It is my understanding that SD has to be associated with something. Example:

Measure the height of 10,000 men, then plot the results on a graph where the x axis is count and the y axis is height. (Assume a Standard Distribution, which is a reasonable assumption.) The peak of the curve is the mean, and 1 SD either way means that 33% of the sample is less than the mean and 33% of the sample is greater than the mean. (Total of 66% of the area under the curve.)

So, in this case SD is associated with the distribution of Height values.

If I publish SD values, what is the SD associated with? SD is associated with a distribution of values - to what distribution of values is this SD associated?

The SD is associated with the distribution of payouts that a single spin generates. When you play one spin you can win 0, or 10, or 200, or 2000 or 50000. SD measures how wide this range of possible payouts is, and the SD values in your help-file measure the range of payouts relative to unit bet size.

On further thought, variance is always measured around the mean value (or average value), so in this sense what I wrote before about SD and RTP being completely independent parameters is inaccurate. Since variance is centered around the mean (or RTP), it's definitely correct to say that "variance is associated with RTP". So I stand corrected; that part of my previous post was wrong.
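
To make "SD of the payouts of a single spin, relative to unit bet size" concrete, here is a minimal Python sketch. The paytable below is invented purely for illustration (it is not Hot Peppers or any real slot), and the function name is made up; the only thing it is meant to show is that the variance is the probability-weighted squared distance of each payout from the mean, and the mean is the RTP.

```python
import math

# Hypothetical single-spin paytable: (payout per 1 unit bet, probability).
# Invented numbers for illustration only; not taken from any real slot.
PAYTABLE = [
    (0.0, 0.70),
    (1.0, 0.20),
    (3.0, 0.08),
    (20.0, 0.019),
    (150.0, 0.001),
]

def rtp_and_sd(paytable):
    """Return (mean payout, standard deviation) of one spin at unit bet size."""
    mean = sum(pay * p for pay, p in paytable)                    # this is the RTP
    variance = sum(p * (pay - mean) ** 2 for pay, p in paytable)  # spread around the RTP
    return mean, math.sqrt(variance)

rtp, sd = rtp_and_sd(PAYTABLE)
print(f"RTP = {rtp:.4f}, SD per unit bet = {sd:.2f}")

# A bigger bet only rescales the SD in money terms, e.g. a $5 spin:
print(f"SD for a $5 spin = ${5 * sd:.2f}")
```

Because the SD is quoted per unit bet, the dollar figure for any stake is just the stake times that number, which is the same scaling used in the "$5*4.7" example earlier in the thread.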

Re: your last paragraph. You're right - if ever there was a thread killer then that sounds like it. Personally, I'm going to vote that the required number will be smaller than expected. For Hot Peppers - 6,478. (Of course, that depends on the definition of "close enough to normal distribution".)
Chris

If you expect the number to be 6478, then why did you try in your previous post to apply the normal distribution approximation to one spin? Some of the stuff you write just doesn't make sense.

Upon further thought, the results in the next topic might actually be interesting to regular players. The results will, for example, answer the following question: If I play 2000 spins at specific slot, what is the probability that I will be ahead after those 2000 spins? I will calculate this probability both by using the normal distribution curve (= measuring the number of SDs from the T-RTP) but also by using the simulator. I anticipate that the simulator will give results that will shock you, and make a section of your help-file questionable if not plain wrong.
 
Jufo,

First of all, I sincerely apologize for obviously really pissing you off. I don't know what else to say. I'm sorry.

I said "having fun with that sim" because I assumed that the work which you outlined in your last paragraph:

... how to use the known SD values to determine the range of results for a sample of N spins, and how large a sample (variable N) is needed so that the resulting distribution is close enough to a normal distribution for such a determination to be mathematically valid?

would be long and tedious otherwise. Being long and tedious you would thus be disinclined to pursue it. I then assumed that the availability of your sim would make this work much easier, to the extent that generating statistical results which I agree would be very interesting would be "fun". Thus - "having fun with that sim". That's all I meant by that.

BTW, Eliot is in Cambodia right now, attending/speaking at a conference on Casino security.

Again, I apologize for making you so angry, or feeling so insulted.

Chris
 
Jufo,

First of all, I sincerely apologize for obviously really pissing you off. I don't know what else to say. I'm sorry.

I said "having fun with that sim" because I assumed that the work which you outlined in your last paragraph:



would be long and tedious otherwise. Being long and tedious you would thus be disinclined to pursue it. I then assumed that the availability of your sim would make this work much easier, to the extent that generating statistical results which I agree would be very interesting would be "fun". Thus - "having fun with that sim". That's all I meant by that.

BTW, Eliot is in Cambodia right now, attending/speaking at a conference on Casino security.

Again, I apologize for making you so angry, or feeling so insulted.

Chris

Ok I see, thanks for clearing this up. It just felt like I was spending so much time and putting in so much effort trying to explain things and bring new things to the table, only to receive sarcastic remarks as a result. But if that's what you meant by "having fun with the sim" then I took it the wrong way and I am fine.
 
I am sorry, Chris, for being quite short-tempered in my previous post and saying things I didn't mean. I would completely rewrite that post but I can't edit it anymore. There were some other things in my life, not related to this thread or you, that had made me very frustrated and annoyed, and when I came back to this thread I became even more frustrated because it felt like the discussion was going nowhere. It felt like trying to swim upstream and only going backwards. So I lost my cool there a bit and I apologize to you. I hope you accept it.

I re-read the last few posts and I understand better now what you were getting at with those questions. I was wrong when I wrote that the SD or Variance doesn't apply to the RTP. Its very definition is to measure the amount of spread around the mean (and here mean = RTP). So the part in the help-file "The Standard Deviation values are associated with the Theoretical RTP values." is entirely correct.

The only part where we didn't reach consensus is how best to express the "relative to unit bet" idea. The concept is trivial, as I tried to show with the money exchange example, but it sounds more complex when you try to explain it. I see now that the page reads:

"Please Note: The Standard Deviation and Variance values for each slot were calculated based on a total bet per spin of 1 unit (1 dollar), and on using the maximum payline count per slot."

and to me this seems just fine.

When describing variance, the page could also say something like: "Variance measures the amount of spread of the results. The larger the variance the more spread out the outcomes will be." This would be a description that is easy to understand. But it kind of says the same thing where describing Low, Medium and High variance slots. At the moment I have no other suggestions.

A player wrote to me by PM that inspired by this thread he had played only 3 lines on a high-variance slot. I wish to stress that this thread is for theoretical discussion and in no way advocating people to use such high risk betting styles.
 
Jufo,

For ease of access, Link Removed ( Old/Invalid) to the page in question.

Note my Definition of Terms for Standard Deviation. What you have just described completely contradicts everything in here. I'm OK with the contradiction. My goal is an accurate publication.

(Technically, we did not calculate the SD. We calculated the Variance (using the equation in my first post), and then just took the square root of that for the SD.)

Assuming SD as it applies to a normal distribution curve, what are your X and Y axis labels?

Given your example #1 below:

20 lines, $5 bet per spin ($0.25 per line): SD = $5*4.7 = $23.5 per spin. Given that 1 SD is "66% of the area under a normal curve", are you saying that 66% of the time your win on this spin will fall between what? 0 and $23.5?


Chris
There is an error in this page:

"For example - using the Crazy 8 Line slot, with a Theoretical RTP of 97.47%, and a Standard Deviation (SD) of 4.8:
Let's say that we have 100 Players, each of whom has played 2,000 rounds of Crazy 8 Line. The Player RTP results that we might expect to see are:

About 68 Players will have an RTP in the range 97.47 +/- 1 SD = 92.7% to 102.3%.
About 95 Players will have an RTP in the range 97.47 +/- 2 SD = 87.9% to 107.1%.
About 99 Players will have an RTP in the range 97.47 +/- 3 SD = 83.1% to 111.9%.
Roughly 1 Player out of 100 will have an RTP below 83.1% or above 111.9%."

The standard deviation is proportional to the square root of the number of games, so the SD of 2000 spins is about 214.6 (units). As a percentage of the amount wagered, it is about 10.7%.
 
Jufo, your apology is immediately accepted. :)


There is an error in this page:

The standard deviation is proportional to the square root of the number of games, so the SD of 2000 spins is about 214.6 (units). As a percentage of the amount wagered, it is about 10.7%.

GrandMaster,

Thanks for the input. As I said in an earlier thread to Jufo, I appreciate people pointing out any errors. My goal is an accurate publication.

I'm assuming that you got your 214.6 units by the formula: Square Root of 2000 = 44.7 * published SD of 4.8 = 214.6. 214.6 / 2000 ~= 10.7%.

Given the back-and-forth that Jufo and I have been doing, I was also curious as to what minimum sample size (other than the 2,000 rounds currently used) would be required in order to approximate the breakdown presented in the Help file's example.

(I ran these experiments and crunched these numbers yesterday, before I read your post. Jufo's sim has the reel layout for my Hot Peppers slot, so I did the following experiment using this slot rather than the Crazy 8 Line slot referenced in the Help file. However, please note: the SD for Crazy 8 Line, at 4.8, is almost exactly the same as Hot Peppers at 4.7.)

If you've read through this thread, you're aware that I have a table in my database containing 20 million rounds of Hot Peppers. For no reason in particular, I picked 5,000 rounds as a place to start. So, I did what was needed to get the RTP for the first 5,000 rounds, then the next 5,000 rounds, and like that.
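
For anyone who wants to repeat this kind of split without doing it by hand, here is a rough Python sketch. It assumes the rounds have already been exported from the database as a list of (bet, win) pairs in play order; that format and the function name are assumptions made for the example, not how the actual table is laid out.

```python
def rtp_per_set(rounds, set_size):
    """Split rounds into consecutive sets and return the RTP (in %) of each full set.

    rounds: list of (bet, win) pairs in play order (assumed export format).
    Any leftover rounds that do not fill a whole set are ignored.
    """
    rtps = []
    for start in range(0, len(rounds) - set_size + 1, set_size):
        chunk = rounds[start:start + set_size]
        total_bet = sum(bet for bet, _ in chunk)
        total_win = sum(win for _, win in chunk)
        rtps.append(100.0 * total_win / total_bet)
    return rtps

# Tiny made-up example: 6 rounds at 1 unit each, split into sets of 3.
demo = [(1, 0), (1, 2), (1, 1), (1, 0), (1, 0), (1, 1.5)]
print(rtp_per_set(demo, 3))
```

With the real data the call would be something like `rtp_per_set(rounds, 5000)`, giving one RTP per block of 5,000 rounds.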

(This was kind of tedious, so I did it for only 50 sets rather than the 100 mentioned in the Help file's example.)

Here is the data: Hot Peppers has a Theoretical RTP of 97.48 and a Standard Deviation (SD) of 4.7.

[table="width: 600, class: grid, align: left"]
[tr]
[td]-3 SD[/td]
[td]-2 SD[/td]
[td]+/-1 SD[/td]
[td]+2 SD[/td]
[td]+3 SD[/td]
[/tr]
[tr]
[td]83.38%[/td]
[td]88.08%[/td]
[td]92.78% to 102.18%[/td]
[td]106.88%[/td]
[td]111.58%[/td]
[/tr]
[tr]
[td]86.51[/td]
[td]88.14[/td]
[td]93.23[/td]
[td]104.16[/td]
[td]108.64[/td]
[/tr]
[tr]
[td]86.55[/td]
[td]88.66[/td]
[td]94.22[/td]
[td]106.48[/td]
[td][/td]
[/tr]
[tr]
[td]87.02[/td]
[td]90.05[/td]
[td]94.26[/td]
[td]106.86[/td]
[td][/td]
[/tr]
[tr]
[td]87.69[/td]
[td]90.29[/td]
[td]94.38[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td]87.87[/td]
[td]90.51[/td]
[td]94.40[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]90.62[/td]
[td]94.51[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]91.05[/td]
[td]94.65[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]91.10[/td]
[td]94.74[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]91.11[/td]
[td]94.75[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]92.08[/td]
[td]95.39[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]92.10[/td]
[td]95.62[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]92.22[/td]
[td]96.33[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td]92.28[/td]
[td]96.42[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]96.57[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]97.26[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]97.37[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]98.08[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]98.28[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]98.63[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]99.21[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]99.37[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]100.03[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]101.18[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]101.21[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]101.50[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]101.86[/td]
[td][/td]
[td][/td]
[/tr]
[tr]
[td][/td]
[td][/td]
[td]101.88[/td]
[td][/td]
[td][/td]
[/tr]
[/table]

There was one outlier - it was in the +4 SD range at 112.85.

So, a sample size of 5,000 falls reasonably well into the +/- 3 SD range using an SD of 4.7.

I am going to use my admittedly very limited understanding of your previous post, and perform a similar calculation.

Square Root of 5000 = 70.7 * published SD of 4.8 = 339.4. 339.4 / 5000 ~= 6.8%.

Working your formula backwards, a sample size of 10,000 would yield an SD of 4.8%. However, although 10,000 rounds gives the more precise result, 5,000 rounds is not exactly out of the ballpark either.

I know that when the issue of Theoretical RTP comes up, often the response is "Oh, you need to play a gazillion games before you'll ever see anything approaching that value.", and like that. I'm being facetious, but I'm sure you get my point. A Player can begin to see the statistics come into play with significantly fewer rounds than a gazillion.

I am thus left to conclude that where you state "There is an error in this page:", you are referring to the sample size of 2,000. If true, I agree. I'll perform some calculations, and run some verification tests, but this number needs to be changed.

Chris
 
Chris,

I think grandmaster meant that the variance for x spins is not the same as the variance for 1 spin.

The statement he quoted from the original page is indeed completely wrong .. like applying oranges to apples so to speak.

If the SD of a single game is 4.8 (or 480%)

then the SD of 2000 games is sqrt(2000)*4.8/2000 = 0.1073 (or 10.7%)

So the numbers given on that page (68% will ... ) .. are wrong.

(they should be :
About 68 Players will have an RTP in the range 97.47 +/- 1 SD = 86.77% to 108.17%.
About 95 Players will have an RTP in the range 97.47 +/- 2 SD = 76.07% to 118.87%.
About 99 Players will have an RTP in the range 97.47 +/- 3 SD = 65.37% to 129.57%.
Roughly 1 Player out of 100 will have an RTP below 65.37% or above 129.57%.
)

i.e.

if you bet $1 and win $5 that would be just over 1 SD from average
if you bet 2000 spins of $1 and walk away with $10000 that would be almost 50 SD from average
(note - both these sessions have the same RTP .. )

Note that although the variance tells you much more about a slot than the RTP does, it is just a single number with
not much more meaning than the average a win will be away from the RTP. It doesn't tell you whether or not this
is with one huge win, or a couple big ones or .. etc ..

in fact, using the normal distribution to approximate the actual distribution of your typical slotmachine is a practice
that requires a lot of insight. For example, on the left side of the curve, this is a fairly valid approximation,
however on the right side (i.e. the winners) .. you will see a lot of exceptions since most bigger wins will be many,
many SD's away from average ..

Determining how many spins is needed for a 'fair' representation is even more difficult and the answer will
be different for each machine. A machine with 99,999 losing spins and one big winner would need a very large
amount of spins .. In general you could use a number of 'rules of thumb', one I use would be 30 times the
feature frequency .. since the feature often includes a very large amount of the payout, any representative
sample would need to include enough features. So if the feature gets hit once every 100, then 3000 spins
would be my rule-of-thumb answer.
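
Written out as a tiny Python helper (the function name and the default multiplier parameter are just for illustration; this is only a rule of thumb, not a statistical guarantee):

```python
def rule_of_thumb_spins(spins_per_feature, multiplier=30):
    """Rough minimum sample size: `multiplier` times the average spins per feature hit."""
    return multiplier * spins_per_feature

print(rule_of_thumb_spins(100))  # feature hit once every 100 spins
```

The multiplier is left as a parameter because a more volatile feature may well need more than 30 hits before the sample starts to look representative.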

Enzo
 
There is an error in this page:

"For example - using the Crazy 8 Line slot, with a Theoretical RTP of 97.47%, and a Standard Deviation (SD) of 4.8:
Let's say that we have 100 Players, each of whom has played 2,000 rounds of Crazy 8 Line. The Player RTP results that we might expect to see are:

About 68 Players will have an RTP in the range 97.47 +/- 1 SD = 92.7% to 102.3%.
About 95 Players will have an RTP in the range 97.47 +/- 2 SD = 87.9% to 107.1%.
About 99 Players will have an RTP in the range 97.47 +/- 3 SD = 83.1% to 111.9%.
Roughly 1 Player out of 100 will have an RTP below 83.1% or above 111.9%."

The standard deviation is proportional to the square root of the number of games, so the SD of 2000 spins is about 214.6 (units). As a percentage of the amount wagered, it is about 10.7%.

A very good spot. In fact I spotted this yesterday too, when I started to make calculations to compare the normal distribution approximation to actual distribution of results.

It looks like the above values were obtained by simply assuming that 1 SD range equals the unit standard deviation (4.8) value as a percentage: (102.3% - 92.7%) / 2 = 4.8%. Chris, this is obviously incorrect. First of all SD values are not percentages (4.8 != 4.8 %) and like Enzo pointed out the range of results for +/- 1 SD always depends on the number of spins. Now I see the confusion from our previous discussions where you thought that standard deviations are ratios like RTP. They are not.

To clarify, below is the calculation you need to do to get the +/- 1 SD range for 2000 spins:

SD_TOTAL = SD_UNIT*BET_SIZE*SQRT(SPINS)

2000 spins, $1 bet size, SD_UNIT = 4.8, gives:

SD_TOTAL = 4.8*$1*SQRT(2000) = $214.66 (1 standard deviation for 2000 spins)

House edge across 2000 spins is 2000*$1*(1-0.9747) = $50.6

So the 1 SD range would be:

-HE-SD_TOTAL .... -HE+SD_TOTAL
-$50.6-$214.66 ... -$50.6+$214.66 = -$265.26 ... +$164.06

When you divide the amounts by total wagered ($2000) you get the RTP range for 1 SD as:

86.74% ... 108.20%

You can work out the rest of the values the same way.
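
For those who would rather not do this by hand, the same steps can be sketched in Python. The function name and argument names are made up for this example; the arithmetic is exactly the SD_TOTAL and house edge calculation above, with the result divided by the total wagered.

```python
import math

def rtp_range(rtp, sd_unit, spins, n_sd=1.0, bet=1.0):
    """Return (low, high) session RTP in percent at +/- n_sd standard deviations.

    rtp     : theoretical RTP as a fraction (0.9747 for 97.47%)
    sd_unit : standard deviation of one spin at unit bet size
    """
    wagered = bet * spins
    sd_total = sd_unit * bet * math.sqrt(spins)   # SD of the total result, in money
    expected_loss = wagered * (1.0 - rtp)         # house edge over the whole session
    low = (wagered - expected_loss - n_sd * sd_total) / wagered
    high = (wagered - expected_loss + n_sd * sd_total) / wagered
    return 100.0 * low, 100.0 * high

for k in (1, 2, 3):
    low, high = rtp_range(0.9747, 4.8, 2000, n_sd=k)
    print(f"+/- {k} SD: {low:.2f}% to {high:.2f}%")
```

Note that the bet size cancels out of the RTP range; it only matters if you want the range in money rather than as a percentage.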

Enzo said:
(they should be :
About 68 Players will have an RTP in the range 97.47 +/- 1 SD = 86.77% to 108.17%.
About 95 Players will have an RTP in the range 97.47 +/- 2 SD = 76.07% to 118.87%.
About 99 Players will have an RTP in the range 97.47 +/- 3 SD = 65.37% to 129.57%.
Roughly 1 Player out of 100 will have an RTP below 65.37% or above 129.57%.
)

Putting the numbers to Excel and avoiding any rounding I get the following numbers which, apart from the last row, are very close to yours:

About 68 Players will have an RTP in the range 97.47 +/- 1 SD = 86.74% to 108.20%.
About 95 Players will have an RTP in the range 97.47 +/- 2 SD = 76.00% to 118.94%.
About 99 Players will have an RTP in the range 97.47 +/- 3 SD = 65.27% to 129.67%.
Roughly 1 Player out of 100 will have an RTP below 72.50% or above 122.44%

The last row disagrees with yours because +/- 3 SD isn't the same as "1 Player out of 100". Instead you need to calculate the RTP at +/- 2.32635 SD,
because in Excel: NORMSDIST(-2.32635) = 1% and NORMSDIST(2.32635) = 99%.
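
The 2.32635 figure can be cross-checked outside Excel as well. A minimal sketch using Python's standard library (the variable names are illustrative; the 97.47, 4.8 and 2000 are the Crazy 8 Line figures quoted above):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.99)      # one-tailed cut-off for the luckiest 1%
sd_pct = 100 * 4.8 / 2000 ** 0.5    # SD of the session RTP over 2000 spins, in percent
low = 97.47 - z * sd_pct            # RTP of the unluckiest 1% (or lower)
high = 97.47 + z * sd_pct           # RTP of the luckiest 1% (or higher)
print(f"z = {z:.5f}, worst 1%: {low:.2f}%, best 1%: {high:.2f}%")
```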
 
Enzo,

Thank you for this considered response. As I mentioned, my objective here is to provide an accurate publication.

As best I can determine, those with knowledge that exceeds my own on this subject are OK with everything on this page except for the number of sample spins defined in the Standard Deviation Definition of Terms section.

I believe that the above statement is True, and would appreciate anyone posting "False" if otherwise.


Using the equation referenced in both your post and in GrandMaster's post:

(sqrt(X) * 4.8) / X = 4.8% (that is, 0.048).

resulted in a spin count of ~10,000. So, I just modified the file to use this number.

For Hot Peppers, which has a Free Spin Frequency of 1 every 75 spins, this would result in a sample count of 2,250 rounds. I would be reluctant to use that sample count, given that the previous table of data was for this slot and a sample count of 5,000 could perhaps be most accurately described as "marginal".

Again Enzo, thanks for the input.

I bring my first statement to everyone's attention and repeat my request that if it is not True then please post "False". (Actually, if you could give me a clue as to what is "False", that would also be appreciated.)

Chris
 
Enzo,

Thank you for this considered response. As I mentioned, my objective here is to provide an accurate publication.

As best I can determine, those with knowledge that exceeds my own on this subject are OK with everything on this page except for the number of sample spins defined in the Standard Deviation Definition of Terms section.

I believe that the above statement is True, and would appreciate anyone posting "False" if otherwise.


Using the equation referenced in both your post and in GrandMaster's post:

(sqrt(X) * 4.8) / X = 4.8% (that is, 0.048).

resulted in a spin count of ~10,000. So, I just modified the file to use this number.

Yes, your RTP ranges are correct for 10,000 spins because SQRT(10000)*4.8/10000 = 100/10000*4.8 = 4.8% so the standard deviation parameter transforms into a percentage.
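
Put another way, the SD of the session RTP is SD_UNIT / SQRT(SPINS), so you can solve for the number of spins that gives any target percentage. A small Python sketch (the function name is made up for illustration):

```python
def spins_for_sd(sd_unit, target_sd_fraction):
    """Spins N such that sd_unit / sqrt(N) equals the target SD.

    target_sd_fraction is the SD of the session RTP as a fraction of the
    amount wagered, e.g. 0.048 for 4.8%.
    """
    return (sd_unit / target_sd_fraction) ** 2

print(round(spins_for_sd(4.8, 0.048)))   # spins needed for a 1 SD range of +/- 4.8%
print(round(spins_for_sd(4.8, 0.024)))   # halving the range takes four times as many spins
```

The second line is the practical consequence of the square root: every halving of the range costs four times the spins.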

The last row is still not correct: "Roughly 1 Player out of 100 will have an RTP below 83.1% or above 111.9%."

You took those RTPs from +/- 3 SD range above but 3 Standard deviations equals a chance of 1 in 740, not 1 in 100. The Standard deviation point for 1 in 100 chance equals 2.32635 SD, not 3 SD.
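
A quick way to check these tail figures, sketched with Python's standard library (variable names are illustrative). It uses the one-tailed probability, which is the convention behind the 1 in 740 figure:

```python
from statistics import NormalDist

std_normal = NormalDist()                 # mean 0, SD 1

p_below_3sd = std_normal.cdf(-3.0)        # chance of ending up more than 3 SD below the mean
print(f"beyond 3 SD on one side: {p_below_3sd:.5f}, about 1 in {1 / p_below_3sd:.0f}")

z_1_in_100 = std_normal.inv_cdf(0.01)     # SD point that cuts off the lowest 1%
print(f"1 in 100 point: {z_1_in_100:.5f} SD")

# Counting both tails together doubles the chance (roughly 1 in 370 for 3 SD).
print(f"beyond 3 SD on either side: about 1 in {1 / (2 * p_below_3sd):.0f}")
```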

For Hot Peppers, which has a Free Spin Frequency of 1 every 75 spins, this would result in a sample count of 2,250 rounds. I would be reluctant to use that sample count, given that the previous table of data was for this slot and a sample count of 5,000 could perhaps be most accurately described as "marginal".

What do you mean by sample count of 2250 rounds above? Free spins are included in the payout of the triggering spin, so 10 000 spins doesn't include free spins, you need to have 10 000 spins where free spins are not counted towards the total.
 
OK guys, I'm trying hard to keep up here.

Because of Jufo's sim and the Hot Peppers reel layouts, most of the specific data in this thread has been about Hot Peppers.

So, I just modified the SD section of the file to reference Hot Peppers.

And, as I mentioned in my earlier post, I changed the sample count from 2,000 rounds to 10,000 rounds.

At the top of this section of the Definition of Terms, I underline the words "given a reasonably large sample size". This is your generic "out clause" for someone that doesn't want to get too specific about exactly what kind of sample size is required and how to calculate it.

I included this "out clause" because I had no freakin' idea how to calculate it. :)

Now that I have been provided with some specific recommendations on how to do that, my concern in providing this additional data is making this file any larger and more dense than it already is. (I think I mentioned in some earlier thread that it was already weighing in at about 500 pounds.)

But, then again, the probability that anyone is actually going to read this page is so vanishingly small, going from 500 pounds to 550 pounds may not be an issue. (Just as with the previous post on Game RTPs, ultimately this thread too will begin its eventual decay into CM's historical thread pile, doing its best impression of "lining the bottom of the bird cage".)

Chris
 
If you've read through this thread, you're aware that I have a table in my database containing 20 million rounds of Hot Peppers. For no reason in particular, I picked 5,000 rounds as a place to start. So, I did what was needed to get the RTP for the first 5,000 rounds, then the next 5,000 rounds, and like that.

Just a heads up, that you can also do this more efficiently with the simulator. Set it to play 5000 rounds and to repeat this 10 000 times, recording the RTP from each simulation of 5000 spins. Then check how the RTPs from the simulation fall into the percentage ranges in your help file. Ok, I can do it...
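
For readers without the simulator, the procedure itself is easy to sketch in Python. To be clear about the assumptions: the paytable below is invented (it is NOT the Hot Peppers reel layout), the function names are made up, and the repeat count is cut to 200 so the example runs quickly. It only illustrates the method: play many sessions, record each session's RTP, then count how many fall inside a given range.

```python
import math
import random

# Hypothetical paytable (payout per unit bet, probability); not a real slot.
PAYOUTS = [0.0, 1.0, 3.0, 20.0, 150.0]
PROBS = [0.70, 0.20, 0.08, 0.019, 0.001]

def simulate_rtps(spins, sessions, seed=1):
    """Play `sessions` independent sessions of `spins` unit bets; return each session's RTP in %."""
    rng = random.Random(seed)
    rtps = []
    for _ in range(sessions):
        wins = rng.choices(PAYOUTS, weights=PROBS, k=spins)
        rtps.append(100.0 * sum(wins) / spins)
    return rtps

def share_within(rtps, low, high):
    """Fraction of sessions whose RTP lies in [low, high]."""
    return sum(low <= r <= high for r in rtps) / len(rtps)

# Theoretical RTP and per-spin SD of the made-up paytable.
mean = sum(p * x for x, p in zip(PAYOUTS, PROBS))
sd_unit = math.sqrt(sum(p * (x - mean) ** 2 for x, p in zip(PAYOUTS, PROBS)))

spins, sessions = 5000, 200               # the post suggests 10,000 repeats
sd_pct = 100.0 * sd_unit / math.sqrt(spins)
rtps = simulate_rtps(spins, sessions)
share = share_within(rtps, 100 * mean - sd_pct, 100 * mean + sd_pct)
print(f"sessions within +/- 1 SD of the theoretical RTP: {share:.0%}")
```

If the session results were exactly normal, about 68% would land inside the +/- 1 SD band; how far the simulated share drifts from that is one crude measure of how good the normal approximation is at that sample size.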
 
The last row is still not correct: "Roughly 1 Player out of 100 will have an RTP below 83.1% or above 111.9%."

You took those RTPs from +/- 3 SD range above but 3 Standard deviations equals a chance of 1 in 740, not 1 in 100. The Standard deviation point for 1 in 100 chance equals 2.32635 SD, not 3 SD.

What do you mean by sample count of 2250 rounds above? Free spins are included in the payout of the triggering spin, so 10 000 spins doesn't include free spins, you need to have 10 000 spins where free spins are not counted towards the total.

Jufo,

I was trying to avoid referring to a fractional person. That is "Approximately 0.3 out of 100 Players ...". I agree, the way that it is currently written may not be statistically accurate, but at least it does not involve any cutting implements.

Enzo, in his "rule of thumb" suggestion, indicated that 30 times the feature frequency (which for Hot Peppers is the Free Spin frequency at 1 in every 75 spins = 2,250 rounds) is one method of calculating the minimum sample size required to approximate the slot's statistics.

Chris
 
Jufo,

I was trying to avoid referring to a fractional person. That is "Approximately 0.3 out of 100 Players ...". I agree, the way that it is currently written may not be statistically accurate, but at least it does not involve any cutting implements.

Why not then use the RTP that corresponds to a 1 in 100 event, which is the point 2.32635 on the standard deviation curve? To me it sounds misleading to say that the unluckiest 1% end up with an RTP lower than 83.1% when such a low RTP is almost ten times rarer than that. I thought you wanted the page to be accurate and perfect?

Enzo, in his "rule of thumb" suggestion, indicated that 30 times the feature frequency (which for Hot Peppers is the Free Spin frequency at 1 in every 75 spins = 2,250 rounds) is one method of calculating the minimum sample size required to approximate the slot's statistics.
Chris

I see. I doubt 2,250 rounds is going to be nearly enough, as I will show you in a short while. In Video Poker the results will be normally distributed once you have hit the top payout (Royal Flush) enough times, so with a 1:40 000 frequency for Royal Flush, it takes ~200 000 rounds until the distribution of results is close to normal. It might be that you need a similarly large number of spins in slots too, enough to expect to hit the largest payouts at least a few times.
 
Note that although the variance tells you much more about a slot than the RTP does, it is just a single number with
not much more meaning than the average a win will be away from the RTP. It doesn't tell you whether or not this
is with one huge win, or a couple big ones or .. etc ..

in fact, using the normal distribution to approximate the actual distribution of your typical slotmachine is a practice
that requires a lot of insight. For example, on the left side of the curve, this is a fairly valid approximation,
however on the right side (i.e. the winners) .. you will see a lot of exceptions since most bigger wins will be many,
many SD's away from average ..

Determining how many spins is needed for a 'fair' representation is even more difficult and the answer will
be different for each machine. A machine with 99,999 losing spins and one big winner would need a very large
amount of spins .. In general you could use a number of 'rules of thumb', one I use would be 30 times the
feature frequency .. since the feature often includes a very large amount of the payout, any representative
sample would need to include enough features. So if the feature gets hit once every 100, then 3000 spins
would be my rule-of-thumb answer.
You may need the exact probabilities of the various outcomes to get a decent estimate, but I will ask one of the statisticians at work if I remember. They may have some kind of approximation for this.
 
Why not then use the RTP that corresponds to a 1 in 100 event, which is the point 2.32635 on the standard deviation curve? To me it sounds misleading to say that the unluckiest 1% end up with an RTP lower than 83.1% when such a low RTP is almost ten times rarer than that. I thought you wanted the page to be accurate and perfect?

LOL. Aw c'mon Jufo, give me a break, huh? I'm still actually trying (although at this point I concede that it is probably a futile effort) to write a file that can actually be understood by someone whose knowledge of all of this at least approaches that of the average person. :)

Chris
 
I will now compare how well the normal distribution interpretation mentioned in the help file and referenced here matches the actual distribution of results. I'll choose the Hot Peppers slot (as there is a simulator for it) with max. lines and 2000 spins for each result. I use 2000 here because this was the value originally mentioned in the help file.

The Hot Peppers slot has an SD of 4.7 and a T-RTP of 97.48%, so below is what the normal distribution would predict the results to be.

Code:
1 SD range (2000 rounds): 10.51% 

+/- 1 SD RTP Range: 86.97% to 107.99%
+/- 2 SD RTP Range: 76.46% to 118.50%
+/- 3 SD RTP Range: 65.95% to 129.01%

RTP of Median player: 97.48%

Probability of being ahead (RTP >100%) after 2000 spins: 40.52%

RTP of the worst 1%: 73.03% or lower
RTP of the highest 1%: 121.93% or higher
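
The figures in the block above can be reproduced with a few lines of Python. This is only a sketch of the normal approximation, using the SD of 4.7 and T-RTP of 97.48% given in the post; the function name and the dictionary keys are invented for the example.

```python
from math import sqrt
from statistics import NormalDist

def normal_rtp_summary(t_rtp, sd_per_spin, spins):
    """Normal-approximation summary of session RTP (all values in percent)."""
    sd_rtp = sd_per_spin / sqrt(spins) * 100      # SD of the RTP over `spins` spins
    model = NormalDist(mu=t_rtp, sigma=sd_rtp)
    return {
        "sd_rtp": sd_rtp,
        "ranges": {k: (t_rtp - k * sd_rtp, t_rtp + k * sd_rtp) for k in (1, 2, 3)},
        "p_ahead": 1 - model.cdf(100.0),          # chance of RTP > 100%
        "worst_1pct": model.inv_cdf(0.01),
        "best_1pct": model.inv_cdf(0.99),
    }

summary = normal_rtp_summary(t_rtp=97.48, sd_per_spin=4.7, spins=2000)
print(round(summary["sd_rtp"], 2))
for k, (low, high) in summary["ranges"].items():
    print(f"+/- {k} SD: {low:.2f}% to {high:.2f}%")
print(f"ahead: {summary['p_ahead']:.2%}")
print(f"worst 1%: {summary['worst_1pct']:.2f}%  best 1%: {summary['best_1pct']:.2f}%")
```
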

Next I will run a simulation of 2000 spins with the same settings, which is repeated 10 000 times. The RTPs from those 10 000 results can be then compared with the above values.

The stats from the 10 000 simulated runs of 2000 spins are:

Code:
+/- 1 SD RTP Range: 87.92% to 106.68%
+/- 2 SD RTP Range: 80.18% to 120.33%
+/- 3 SD RTP Range: 73.50% to 174.42%

RTP of Median player: 96.69%

Probability of being ahead (RTP >100%) after 2000 spins: 36.51%

RTP of the worst 1%: 77.70% or lower
RTP of the highest 1%: 127.28% or higher

Compare these numbers with the ones above.
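
For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of the simulation step. The paytable below is HYPOTHETICAL (the Hot Pepper paytable is not given in this thread), so the numbers it prints will not match the ones above; the session count is also kept small so that it runs quickly in plain Python.

```python
import random
from statistics import mean, median, quantiles

# HYPOTHETICAL paytable: (payout as a multiple of the bet, probability).
# This is NOT the Hot Pepper paytable.
PAYTABLE = [(0, 0.72), (1, 0.15), (2, 0.08), (5, 0.04), (20, 0.009), (250, 0.001)]

def theoretical_rtp(paytable):
    """Expected payout per 1-unit bet."""
    return sum(pay * prob for pay, prob in paytable)

def simulate_session_rtps(paytable, spins, sessions, seed=0):
    """Return the RTP (in percent) of each simulated session."""
    rng = random.Random(seed)
    payouts = [pay for pay, _ in paytable]
    weights = [prob for _, prob in paytable]
    results = []
    for _ in range(sessions):
        total = sum(rng.choices(payouts, weights=weights, k=spins))
        results.append(100.0 * total / spins)
    return results

rtps = simulate_session_rtps(PAYTABLE, spins=2000, sessions=500)
cuts = quantiles(rtps, n=100)          # 99 cut points; cuts[0] is the 1st percentile
print("theoretical RTP %:", round(100 * theoretical_rtp(PAYTABLE), 2))
print("mean / median RTP %:", round(mean(rtps), 2), round(median(rtps), 2))
print("worst 1% at or below:", round(cuts[0], 2))
print("best 1% at or above:", round(cuts[98], 2))
print("share ahead (RTP > 100%):", sum(r > 100 for r in rtps) / len(rtps))
```

The percentiles from the simulated sessions can then be set side by side with the values the normal approximation predicts for the same RTP and SD.
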

Below is the plot of the results. The x-axis is RTP (1% increments) and y-axis is the frequency count for that RTP. In red are the frequencies expected by the normal distribution assumption and in blue are the actual frequencies obtained by simulation.

normal_compare.webp

A few notable differences from normal distribution:

The median and mode of actual results are less than what is predicted by normal distribution.

Normal distribution expects higher frequencies for low RTPs than what actually occurs. The lowest RTP in the sample data was 69.91%. Normal distribution approximation expects there to be 44 samples with lower RTP than that (out of 10,000).

In contrast, at the high RTP end there were many more results than the normal distribution expects. Note that the X-axis changes to 10% increments at 140% RTP to save space (the right side of the vertical line). According to the normal distribution, the highest RTP you would expect to see in a sample of 10,000 is 136.57%. In the actual data there were 52 samples with a total RTP higher than that. In addition, the top 10 results (the top 1/1000th) were this many standard deviations above the mean:

7.62
7.68
7.72
7.79
7.79
8.31
8.45
8.63
8.93
9.05

Obviously, under a normal distribution it is not realistically possible to have results more than seven standard deviations from the mean in a sample of 10,000, let alone 10 such results. This means that the normal distribution doesn't handle rare big wins well, even over 2000 spins.
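
The tail figures quoted above can be checked with a short Python sketch. It uses only the values given in the post (T-RTP 97.48%, SD 4.7, 2000 spins, 10,000 samples); the variable and function names are made up for the example.

```python
from math import erfc, sqrt
from statistics import NormalDist

T_RTP = 97.48                      # theoretical RTP in percent (from the post)
SD_RTP = 4.7 / sqrt(2000) * 100    # SD of the RTP over 2000 spins, about 10.51
SAMPLES = 10_000

def upper_tail(z):
    """P(Z >= z) for a standard normal, accurate far out in the tail."""
    return 0.5 * erfc(z / sqrt(2))

model = NormalDist(mu=T_RTP, sigma=SD_RTP)

# RTP that only 1 sample in 10,000 should exceed under the normal model
one_in_n_rtp = model.inv_cdf(1 - 1 / SAMPLES)

# Expected number of samples below the lowest observed RTP (69.91%)
expected_below = SAMPLES * model.cdf(69.91)

# Expected number of samples at +7 SD or beyond
expected_7sd = SAMPLES * upper_tail(7.0)

print(round(one_in_n_rtp, 2), round(expected_below, 1), expected_7sd)
```

The last number is tiny compared with the ten results actually observed beyond +7 SD, which is the point being made above.
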

Let's check the statistical tests for normality. I chose the Kolmogorov-Smirnov normality test as it should be quite lenient in accepting the null hypothesis for normality. The result was:

QQplot.webp

Normality test very clearly rejects the null hypothesis of normal distribution. The plot above (Q-Q plot) shows the deviation from normality very clearly. If the results were normally distributed they would stay close to the red straight line. Now there is considerable deviation at both the negative SD (low RTP) and the high SD (high RTP) ends.

Based on the above study, I'd conclude that 2000 spins isn't nearly enough to estimate the range of results by normal distribution. Given that this was a lower-variance slot, this applies even more to higher-variance slots. The differences between actual and estimated probabilities are several percent even in the parts where the data matches the normal distribution well. The normal distribution vastly underestimates the frequency of big wins and high RTPs, such as the several +7 SD ... +9 SD outcomes in the data. It also overestimates the frequency of low RTPs.
 
LOL. Aw c'mon Jufo, give me a break, huh? I'm still actually trying (although at this point I concede that it is probably a futile effort) to write a file that can actually be understood by someone whose knowledge of all of this at least approaches that of the average person. :)
Chris

My beef was that I hate it when the casino underestimates the odds of a rare loss. So I hope at least you put it right.
 
Jufo,

First of all, your post #92 in this thread is the most impressive post that I have ever seen at Casinomeister, both in content and in presentation.

The 2 charts/graphs at the bottom present a much more grim picture of the data than the 2 "blocks" of text data nearer the top. Just going by the "text blocks", the less knowledgeable reader (that is, me) might say, "Well, they're certainly off, but the differences are not overwhelmingly different". But the charts present a more grim picture.

It's interesting that your data supports something mentioned by Enzo - significant deviations from a normal distribution will occur at the high end, the winning end, of the curve because of the rare but significant "big winners." However, it also shows a significant deviation, in the other direction, at the lower end of the curve.

So, 2,000 as a sample size for a slot with an SD of 4.7 is clearly a problem. I'm left to wonder whether any further conclusions can be drawn from the data because of this.

Chris
 
Jufo,

First of all, your post #92 in this thread is the most impressive post that I have ever seen at Casinomeister, both in content and in presentation.

Lol, thanks. Given the amount of work and research I put in to be able to produce that post, I'd respond: "Yeah, it better be" ;)

The 2 charts/graphs at the bottom present a much more grim picture of the data than the 2 "blocks" of text data nearer the top. Just going by the "text blocks", the less knowledgeable reader (that is, me) might say, "Well, they're certainly off, but the differences are not overwhelmingly different". But the charts present a more grim picture.

It's interesting that your data supports something mentioned by Enzo - significant deviations from a normal distribution will occur at the high end, the winning end, of the curve because of the rare but significant "big winners." However, it also shows a significant deviation, in the other direction, at the lower end of the curve.

Yes, I recognized from Enzo's post that he clearly knows a lot about the subject. I'd guess that he has studied the same things as I did in the above post. Yes, the tails of the distribution curve are the problematic part. It's actually very common in science that the middle part of the data follows normal distribution well but deviates from it at the tails. The problem is that the tails are usually the most interesting part: in this case large losses and large wins. It's important to know that normal distribution cannot handle these well and considers the largest wins to be nearly +10 SD outcomes.

A paranoid casino manager (who is paranoid of winners) might incorrectly apply normal distribution to slot results and think there is something wrong with their games because of those frequent +10 SD outcomes :D

So, 2,000 as a sample size for a slot with an SD of 4.7 is clearly a problem. I'm left to wonder whether any further conclusions can be drawn from the data because of this.

I'll redo the previous analysis with 10,000 spins to see if there is any improvement and report here shortly.
 
Jufo,

Understood and agreed about the tails, both tails.

However, if I understand Enzo correctly, he indicated that it was the high-end tail that would experience the greater number of "outliers" with a slot RTP distribution, because of the "big wins".

Losses, the low-end tail, are more controlled, in a sense, because of the relative limitation in the range of the available bet amounts. Wins, however, especially those rare and really big wins, are going to screw with the high end.

HOWEVER, all that said, the gross impact might be small in the grand scheme of things. That is, you plotted 10,000 samples, each of 2,000 games. You wound up with 52 samples whose RTP exceeds expectations, or 0.5% of the samples, which in the grand scheme of things is not that large a number.

I'll be interested in seeing your 10,000 sample size runs. The value of your sim is obvious. It would take me 4 or 5 days, maybe more, to do 10,000 sample sets. I don't have that time.

BTW, I modified the sample size referenced in the Help file from 100 to 1,000. The final line now reads:

"Only 3 Players in a thousand ..."

Chris
 
Ok, I now re-did the sim with 10,000 runs of 10,000 spins (this equals 100 million spins in total so the simulator took about half an hour to play all those spins...)

The most visible improvement from the increase to 10,000 spins is that the most extreme high RTPs are cut off. In the previous simulation of 2000 spins, the highest RTP in the data (out of 10,000 samples) was ~194%. Since the range for 1 SD was 10.51% (=4.7*SQRT(2000)/2000), this RTP equals (194% - 97.48%)/10.51% = 9.2 SDs. Nine standard deviations is of course effectively impossible under a normal distribution.

With 10,000 spins the highest RTP (out of 10,000 samples) was 125.43%. Since the range for 1 SD is now 4.7% (=4.7*SQRT(10000)/10000), this RTP equals (125.43% - 97.48%)/4.7% = 5.9 SDs.

5.9 SD equals odds ~1 in 750 million, so the normal distribution still fails to set correct odds for the biggest wins but at least this is an improvement from 9 SD.
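
The SD arithmetic above is easy to script. The sketch below follows the same formula as the post (SD * SQRT(n) / n) and converts the result to one-sided normal odds; the function names are invented for the example.

```python
from math import erfc, sqrt

def sd_of_rtp(sd_per_spin, spins):
    """SD of the session RTP in percentage points: SD * sqrt(n) / n."""
    return sd_per_spin * sqrt(spins) / spins * 100

def z_score(observed_rtp, t_rtp, sd_per_spin, spins):
    """How many SDs the observed RTP lies above the theoretical RTP."""
    return (observed_rtp - t_rtp) / sd_of_rtp(sd_per_spin, spins)

def one_in(z):
    """One-sided normal odds of a result at least z SDs above the mean."""
    return 1.0 / (0.5 * erfc(z / sqrt(2)))

z_2000 = z_score(194.0, 97.48, 4.7, 2000)       # about 9.2
z_10000 = z_score(125.43, 97.48, 4.7, 10_000)   # about 5.9
print(round(z_2000, 1), round(z_10000, 1), f"1 in {one_in(z_10000):,.0f}")
```

Odds this far out in the tail are very sensitive to rounding: feeding in exactly 5.9 instead of the unrounded value gives roughly 1 in 550 million rather than a figure in the region of 1 in 750 million.
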

Below are distribution plot and QQ-plot for 10,000 spins:

hist.webp
qqplot2.webp

If you compare these plots with the previous 2000-spin result, you will notice that the distribution is getting closer and closer to a normal distribution. But the normality test (Kolmogorov-Smirnov) still very clearly rejects the normality hypothesis. It's not even close.
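
For readers unfamiliar with the Kolmogorov-Smirnov test, here is a minimal sketch of the idea in plain Python. The post does not say which software or settings were used, so this is an assumption-laden illustration: it tests against a fully specified normal (mean and SD fixed in advance) and uses the standard asymptotic 5% critical value of about 1.358/sqrt(n). The demo data are synthetic, not the simulation results.

```python
from math import log, sqrt
from statistics import NormalDist

def ks_statistic(samples, dist):
    """Largest gap between the empirical CDF and the model CDF."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        model_cdf = dist.cdf(x)
        # the empirical CDF jumps from i/n to (i+1)/n at x
        gap = max(gap, (i + 1) / n - model_cdf, model_cdf - i / n)
    return gap

def rejects_normality(samples, dist, coefficient=1.358):
    """Asymptotic 5% test for a fully specified null: reject if D > 1.358/sqrt(n)."""
    return ks_statistic(samples, dist) > coefficient / sqrt(len(samples))

n = 1000
normal_model = NormalDist(mu=97.48, sigma=4.7)
# Deterministic demo data: evenly spaced quantiles of two distributions
bell_shaped = [normal_model.inv_cdf((i + 0.5) / n) for i in range(n)]
right_skewed = [-log(1 - (i + 0.5) / n) for i in range(n)]  # exponential, mean 1, SD 1

print(rejects_normality(bell_shaped, normal_model))
print(rejects_normality(right_skewed, NormalDist(mu=1.0, sigma=1.0)))
```

The second data set has the same mean and SD as the normal it is tested against, but its long right tail produces a large gap between the two CDFs, which is the same kind of mismatch the Q-Q plots show for slot results.
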

So, my opinion of the text below in the help file:

For example - using the Hot Peppers slot, with a Theoretical RTP of 97.48%, and a Standard Deviation (SD) of 4.7:
Let's say that we have 1000 Players, each of whom has played 10,000 rounds of Hot Peppers. The Player RTP results that we might expect to see are:

About 682 Players will have an RTP in the range 97.47 +/- 1 SD = 92.7% to 102.3%.
About 954 Players will have an RTP in the range 97.47 +/- 2 SD = 87.9% to 107.1%.
About 997 Players will have an RTP in the range 97.47 +/- 3 SD = 83.1% to 111.9%.
Only 3 Players in a thousand will have an RTP below 83.1% or above 111.9%.


is that the normal distribution approximation works very badly for slots. If the above snippet of text were about Blackjack or even-money bets on roulette (each of which has an SD close to 1), then even a sample as small as 20 spins/rounds would already be reasonably close to normally distributed. But it doesn't work when there are high payouts involved. Therefore I would either remove the above part of the text completely or add a more informative disclaimer, which says something like:

"The number of spins required for the results to be even moderately close to normal distribution is very high and even then it will not be accurate for low or high return percentages. The result will also be more unreliable for higher variance slots."
 
Understood and agreed about the tails, both tails.

However, if I understand Enzo correctly, he indicated that it was the high-end tail that would experience the greater number of "outliers" with a slot RTP distribution, because of the "big wins".

Losses, the low-end tail, are more controlled, in a sense, because of the relative limitation in the range of the available bet amounts. Wins, however, especially those rare and really big wins, are going to screw with the high end.

Yes, it's pretty much like you said above. Both tails deviate from normal distribution but the right tail deviates from it more because of those rare big hits. The left tail deviates from normal distribution because the maximum you can lose in every round is your bet. However normal distribution doesn't know this: the SD of 4.7 per spin indicates that a +/- 1 SD range is either a win or loss of 4.7 units, but you can never lose 4.7 units by betting 1 unit, can you? So like you said this means that the losses are "more controlled" than what the normal distribution estimates them to be.

In other words, normal distribution assumes wins and losses to be symmetrical where they are inherently not (it's possible to lose only 1 unit but to win XX units in every spin). The result will be normally distributed only after you have played so many spins that this asymmetry has "dissipated" under the large number of spins.

You could describe the right tail like this: Normal distribution becomes valid only when there are enough spins such that a player who has had a single big hit is indistinguishable from all other players in terms of his overall RTP. In other words the peak caused by the big win gets "dissipated" in the large number of spins.
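
One way to put a number on this "dissipation" is skewness, which is zero for a normal distribution. For a sum of n independent spins the skewness is the single-spin skewness divided by sqrt(n), so it shrinks only slowly. The sketch below uses a HYPOTHETICAL paytable (not any real slot) and made-up function names, purely to show the scaling.

```python
from math import sqrt

# HYPOTHETICAL paytable: (payout as a multiple of the bet, probability)
PAYTABLE = [(0, 0.72), (1, 0.15), (2, 0.08), (5, 0.04), (20, 0.009), (250, 0.001)]

def per_spin_moments(paytable):
    """Mean, SD and skewness of the payout of a single 1-unit spin."""
    mu = sum(pay * p for pay, p in paytable)
    var = sum(p * (pay - mu) ** 2 for pay, p in paytable)
    third = sum(p * (pay - mu) ** 3 for pay, p in paytable)
    return mu, sqrt(var), third / var ** 1.5

def session_skewness(paytable, spins):
    """Skewness of total winnings over n independent spins: skew_1 / sqrt(n)."""
    return per_spin_moments(paytable)[2] / sqrt(spins)

for n in (1, 2000, 10_000, 1_000_000):
    print(n, round(session_skewness(PAYTABLE, n), 3))
```

Because of the square root, going from 2000 to 10,000 spins only reduces the skewness by a factor of about 2.2, which fits the observation that the distribution is closer to normal at 10,000 spins but still clearly not normal.
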

HOWEVER, all that said, the gross impact might be small in the grand scheme of things. That is, you plotted 10,000 samples, each of 2,000 games. You wound up with 52 samples whose RTP exceeds expectations, or 0.5% of the samples, which in the grand scheme of things is not that large a number.

52 samples out of 10,000 exceeded the maximum payout expected under the normal distribution. But even below that point there were discrepancies between actual and expected frequencies. I checked that they start at around the 130% RTP mark. This means that roughly the top 1% of results are problematic.

I'll be interested in seeing your 10,000 sample size runs. The value of your sim is obvious. It would take me 4 or 5 days, maybe more, to do 10,000 sample sets. I don't have that time.

Yeah, when in post #81 you started to calculate RTPs of samples of 5,000 one at a time, I was like: don't do that, my sim can do it 10 000 times in one click. Also, doing only 50 sets of 5,000 and seeing that they happen to fall within the +/- 3 SD range is not very convincing, i.e. you can't draw the conclusion that the data matches the hypothesis. That's why I made the "heads up" post #87.

BTW, I modified the sample size referenced in the Help file from 100 to 1,000. The final line now reads:

"Only 3 Players in a thousand ..."

Ok, that sounds fine.

However, I checked that the true odds of ending up with an RTP lower than 83.1% in 10k spins are of the order of ~1/10000, rather than the ~1/667 implied by the statement "3 in a thousand" (divided by two). The normal distribution is again very inaccurate here, as this is a probability related to a tail.
 