Well done Marc, nice piece.
One minor quibble -- you say Prize Bonds might work out better than cash deposits if you are in the top tax bracket. But DIRT isn't dependent on your tax bracket, so I wasn't sure what you were getting at there. I suppose there's a dependence on how much you earn from deposit interest, since if you are over the "chargeable" limit you additionally pay 4% PRSI, but this is dependent on the amount of your unearned income rather than your tax bracket.
A second, even more minor quibble is about the statement that "the graph is more lumpy toward the left hand side indicating that you are more likely to do badly than to do well". That's not wrong, but it could be caveated by saying "you are more likely to do badly than to do well in any given year". In the long run, of course, you're likely to do "averagely". First we have to agree that by "doing well" we mean getting your average return or better. Let's revisit the €5k graph:
Clearly, the most likely number of €50 wins (called the mode of the distribution) is 1, since that number has the highest percentage chance at 36.28% (based on latest figures). The average (or mean) number of wins over a long period is actually 1.176 -- higher than the mode. (We get the average by just multiplying the total number of €50 prizes by our fraction of the total Prize Bonds held.) Now, you're right that in any given year you're more likely to do "badly" (i.e. get less than the mean) because if we sum the first two columns to give the chance of getting either zero or one wins -- both of which are less than the mean -- we get 67.13%, i.e. about a 2/3 chance of doing worse than average. On the other hand, if we subtract that number from 100% we get 32.87%, which is the chance of getting more than one win. Now, clearly, each year in which we win more than one prize makes up for a year when we win zero. And we're more likely to win more than one (32.87%) than we are to win zero (30.85%). Not only that, but some of those "more than one" years will be more than two! So it's easy to see why the mean is higher than the mode.
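For anyone who wants to check those percentages, here is a rough sketch in Python. It assumes the chance of each number of wins follows the Poisson formula with a mean of 1.176, which is my own stand-in for the exact binomial calculation (the two should agree very closely when there are this many bonds in the draw). The function and variable names are just for illustration.

```python
import math

def poisson_pmf(k, mean):
    """Chance of exactly k wins in a year when the long-run average is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Average number of 50 euro wins per year for the 5k holding, as quoted in the post.
mean_wins = 1.176

p_zero = poisson_pmf(0, mean_wins)
p_one = poisson_pmf(1, mean_wins)
p_below_mean = p_zero + p_one        # zero or one win: both are below the mean
p_more_than_one = 1 - p_below_mean   # everything else

print(f"zero wins:          {p_zero:.2%}")
print(f"one win (the mode): {p_one:.2%}")
print(f"zero or one win:    {p_below_mean:.2%}")
print(f"more than one win:  {p_more_than_one:.2%}")
```

The four printed figures should line up with the 30.85%, 36.28%, 67.13% and 32.87% discussed above.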
Intuitively, the mean (i.e. average) is higher than the mode because of the skew caused by not being able to win fewer than zero times, while still having a possibility of winning many times more than the mean. (In theory, with a single prize bond, you could win all 470,400 x €50 prizes in a year, but that would take much longer than the lifetime of the universe on average.)
Now for the science bit. Technically, the type of distribution in our €5k graph above is called a binomial distribution. When the number of chances to win is very large and the chance of winning on each one is very small, as with Prize Bonds, the binomial distribution is very closely approximated by a simpler one called a Poisson distribution, which depends only on the average number of wins. (Like the binomial, the Poisson is a discrete distribution: it only gives probabilities for whole numbers of wins, though its formula can be drawn as a smooth curve for comparison.) That's as distinct from a normal or Gaussian distribution -- the well-known bell curve -- which arises when a quantity is the sum of many small independent random effects. A Gaussian is a symmetrical curve where the mode and the mean are always the same, i.e. your most likely number of wins is identical to your average number of wins. As the average number of wins increases, the Poisson distribution more and more closely approximates a Gaussian distribution, which is another way of saying what we already concluded: the more you invest, the more likely you are to land close to the average in any given year. For smaller investments your winnings will still tend towards the average, but over a longer (perhaps very much longer) number of years.
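To put a rough number on "more likely to land close to the average", here is a small Python sketch. It assumes yearly wins follow a Poisson distribution, and it defines "close" as within 25% either side of the mean. That band is an arbitrary choice of mine for illustration, not something from Marc's piece.

```python
import math

def poisson_pmf(k, mean):
    """Chance of exactly k wins; computed via logs so larger k stays stable."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def chance_near_mean(mean, band=0.25):
    """Chance that a year's wins fall within +/- `band` (a fraction) of the mean."""
    low = math.ceil(mean * (1 - band))    # smallest whole number of wins in the band
    high = math.floor(mean * (1 + band))  # largest whole number of wins in the band
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))

# Means quoted in the post for the 5k and 100k holdings.
near_5k = chance_near_mean(1.176)
near_100k = chance_near_mean(23.52)

print(f"5k holding:   {near_5k:.1%} chance of being within 25% of the mean")
print(f"100k holding: {near_100k:.1%} chance of being within 25% of the mean")
```

With the €5k holding, the only whole number of wins inside the band is one, so the figure is just the chance of exactly one win. With the €100k holding the band covers 18 to 29 wins, and the chance of landing inside it comes out much higher.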
I'll end with some graphs showing how the Poisson distribution (in blue) differs from the Gaussian (in purple). Here they both are with a mean (vertical grey line) of about 1.17, the average number of €50 wins we expect for our €5k Prize Bond investment. The mode (which, remember, is the most likely outcome) is the position along the horizontal axis of the highest point on each curve. The mean, on the other hand, is the balance point of the curve: the position on the horizontal axis where the shape would balance if it were cut out of card. (The point with equal areas to its left and right is the median, which is not the same thing for a lopsided curve.) For the Gaussian, which is symmetrical, the mean and the mode are always the same -- but we can only achieve this by allowing for the impossible case of negative numbers of wins, as can be seen where the purple curve extends to the left of zero. But for the Poisson distribution the mode is less than the mean:
Here is another case, where the Poisson and the Gaussian once again have the same mean, this time equal to 23.52 -- the average number of €50 wins for our €100k investment. See how much more closely the Poisson distribution approximates the Gaussian, and how much closer the mode is to the mean:
Finally, here are some graphs plucked from [broken link removed] showing a binomial distribution (quite similar to our Prize Bond one) overlaid with a Gaussian with the same mean, for different values of the mean. We see how the binomial distribution, like the Poisson (which closely approximates it), more nearly approximates the Gaussian as the mean goes up:
[broken link removed]
You can also see that more of the binomial distribution hangs out the left hand side of the area enclosed by the Gaussian because its mode is less than the mean, as we expect. (As Marc puts it, it is "more lumpy to the left", which technically is called a right skew, because the longer tail is on the right.)
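Since the graphs can't speak for themselves, here is a rough Python sketch of the same Poisson versus Gaussian comparison in numbers. One loud assumption: I give the Gaussian the same spread as the Poisson (a standard deviation equal to the square root of the mean), which may not be exactly what was used for the purple curves. The `compare` helper is just something I made up for this illustration.

```python
import math

def poisson_pmf(k, mean):
    """Chance of exactly k wins; computed via logs so larger k stays stable."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def gaussian_pdf(x, mean, sd):
    """Height of the bell curve at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def gaussian_cdf(x, mean, sd):
    """Area under the bell curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def compare(mean):
    sd = math.sqrt(mean)  # ASSUMPTION: Gaussian given the same spread as the Poisson
    ks = range(0, int(mean + 10 * sd) + 1)  # whole numbers of wins, far into the tail
    mode = max(ks, key=lambda k: poisson_pmf(k, mean))
    # Largest vertical gap between the two curves at any whole number of wins.
    worst_gap = max(abs(poisson_pmf(k, mean) - gaussian_pdf(k, mean, sd)) for k in ks)
    # Share of the bell curve sitting over impossible negative numbers of wins.
    below_zero = gaussian_cdf(0, mean, sd)
    return mode, worst_gap, below_zero

mode_5k, gap_5k, neg_5k = compare(1.176)
mode_100k, gap_100k, neg_100k = compare(23.52)

print(f"5k:   mode {mode_5k} vs mean 1.176, worst gap {gap_5k:.4f}, Gaussian below zero {neg_5k:.2%}")
print(f"100k: mode {mode_100k} vs mean 23.52, worst gap {gap_100k:.4f}, Gaussian below zero {neg_100k:.6%}")
```

For the €5k case a noticeable slice of the bell curve sits over negative wins and the two curves disagree visibly. For the €100k case both effects shrink to almost nothing, and the mode (23) sits right beside the mean.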