Basic Excel Business Analytics #65: Create Random Variables: Custom, Binomial, Uniform, Normal

# Basic Excel Business Analytics #65: Create Random Variables: Custom, Binomial, Uniform, Normal

Welcome to Highline BI348
Class Video Number 65. If you want to
follow along, click on the link below the video. Last video we’re talking
about creating a Monte Carlo simulation. This video we want to talk about
creating random or uncertain variables in Excel
for, of course, our Monte Carlo simulation. Now we’re going to see four
different variables based on four different distributions. We’ll look at our relative
frequency distribution variable, binomial
distribution variable, uniform, and normal
distribution variable. Now remember, before we can
create random variables, we must have a good
estimate of what the probability distribution
is for that variable. So in our first
example, we’ll go over two discrete probability
distribution sheet. Here is our variable
we need to randomize, direct labor cost per unit. Now this is a
discrete variable, and from a historical relative
frequency distribution– that means from the
accounting department we have estimated
these probabilities for these direct labor costs. Now in the process
we rounded them off, so we have an actual 39,
then a 40, then a 41. So this is a discrete
distribution. Actually, last video
we had a cost example. And we didn’t round it off. And we treated it as a
continuous uniform variable. But this one we have a different
set of probabilities and it is a discrete variable. So how in the world do we take
this, a relative frequency distribution or a
probability distribution, and use it to create
a random variable? Well, we’re first going to
look at the RAND function. Last video we looked
at RAND between. Here we’re going
to look at RAND. Both of these functions
use a uniform distribution for randomly generating numbers. Now, this uniform
distribution for RAND won’t work straight
out of the box, but we’ll do something
interesting in the formulas down here to get this
probability distribution to show up in our formula. Now what does the
RAND function do? I actually want to
backspace, because this is an argumentless function. You don’t put anything in
between the open and the closed parentheses. The RAND function
will automatically generate a random number when
I Control Enter, between 0 and 1– 0 is included, 1 is
not– up to 15 digits long. Now I don’t have all
the digits showing here. You could increase or decrease. But there’s 15 digits there. That means like 999
trillion possible values. And it is selecting amongst
all those values in accordance with a uniform distribution. And every time I
hit the F9 key, you can see the formula
is randomizing, selecting a number
between 0 and 1. So this by itself is
not going to help us with this distribution. But we want to get clever here. We want to convert this
into a lookup table and have this RAND
as our lookup value. Now let’s just hit
F9 here, F9 here. 99. One more, 32. If I were to add, let’s say,
these first three probabilities up, this would be 45. So 32. If I were to use this
as a lookup value and have a column with
cumulative probabilities, it would say 0.1, 0.25, 0.45. So this would hit the first
bigger one and jump back. And it would select that 40. Now I want to go ahead and
create this extra column here, because that’s the
trick for using a historical relative
frequency distribution. Now the first value has to be 0,
because when you have a VLOOKUP you have to have the
smallest possible value. Now this value right
here has to be 0.1. This value right here
has to be the sum of both of these, which is 0.25. So watch this. I’m going to get tricky. I’m going to use
the Sum function with an expandable range. I’m going to click
in that cell C5. Colon gives me the second C5. Close parentheses. Put your cursor right in
the first cell reference, F4 to lock it. That means the
first C5 is locked but the second one is not. Because I’m copying
down across rows, that 5 will turn to 6,
then 7, successively adding and creating a running
total or a cumulative total. Control Enter. And copy it down. To see that expandable range,
select any cell and F2. You can see the 5 moved to 6,
and the range is expanding. If I come down to the end, boom. That’s to 10. That has expanded the whole way. So for the first column
of our lookup table, we need our cumulative total. Now we can see
explicitly if this is our first column
for VLOOKUP or LOOKUP, because we’re doing
an approximate match. That’s the first column. This is the second
column with the thing to go and get and
bring back to the cell. Notice 0.400 somewhat. It comes racing down, bumps
into the first value bigger, jumps back. And it would go and get the 41. Now lookup functions and
VLOOKUP with approximate match don’t literally search through,
bump into the first bigger one, and jump back. That’s just a metaphor
for understanding how a binary search works. Let’s try this. Let’s us go ahead. And by all means, if you want
to use VLOOKUP and use the RAND as the lookup value, the
lookup table as this and then comma 2 for a column index. But we’re going to end up
doing a lot of formulas for our simulations. And I don’t want to have to
type the extra column here. So if we know we’re doing
approximate match lookup, I’m just going to use LOOKUP. And actually we covered
this topic extensively back in Chapter 7. So I’m using Equals LOOKUP. And I also have a slide
how LOOKUP function works, and how it’s different
than VLOOKUP. So I’m going to look this up. There’s the lookup value. Comma. And the array is the one I want. First column, second column. Now the beautiful
thing about LOOKUP is it will always select a
value from the last column in the array table. It happens to be two
columns, so of course it’s going to get that. That’s all I have to put. Close parentheses
and Control Enter. There’s our random value. So what did it do? It took 11%. It raced down. It bumped into the
first bigger one, jumped back, and got the 40%. If I hit F9 key to
randomize 36, it raced through, bumped
into the first bigger one, jumped back, got the 41. I could keep going, randomizing. You can also see a
visual over here. So LOOKUP is taking the
cumulative probability all the way up to the
42, stopping there. And so it’s getting the 42. F9. Here it’s getting the 41. I’m hitting F9 a bunch. Now it’s getting the 43. Now I’m going to add a
little formatting over here. And I followed my convention. The cells that have something
typed into them have no fill. The ones that have a
formula have the fill. And that’s how we can create
a random variable based on a custom relative frequency
distribution or estimated probabilities. Now when we get to
our formulas later, we’re not going to do it
in two separate cells. We’re simply going to go LOOKUP,
RAND, argumentless function. Comma. The ARRAY. Highlight first
and second column. It will automatically get
it from the second column. And Control Enter. Now if I hit F9 here, these
two are going to be different. Because that cell is
linked to that RAND. And this one has a RAND
internally, F9, F9. So I’m randomly selecting based
on this frequency distribution. You know, if you
didn’t trust this and wanted to prove that
this formula works, go ahead and we learn how to do
data tables last video, do a data table with 10,000
rows based on that formula. And then do a pivot
table based on it and build your own
probability distribution. And you’ll see that the
simulated value probability table will be almost
exactly equal to this. Now that is our first example. Now let’s go over to discrete
binomial distribution example. Now we have a situation here. We have number of people
who attend a seminar. And that’s going to
be number of trials for our binomial experiment. So 35 people come. A success is going
to be whether or not any one of those
attendees at the seminar becomes a new customer. And the probability of success,
from our past experience for any one attendee becoming
a new customer, is about 6.5%. Now before we can use the
binomial distribution, we have to run our four
tests for whether or not it’s a binomial experiment. Are there are a fixed
number of trials? Yes. Only two outcomes
for each trial? Yes. With either a new
customer or they’re not. So we’ll say yes. Probability of success
stays the same each time? Yes. This will stay the
same each time. Each trial is independent? We will assume that that’s
true for this experiment. If all of these
are true, then we can use the binomial
distribution. Now we don’t have to get fancy
with VLOOKUP, because there’s a function called
BINOM.INVERSE just like there will be one called NORM
for normal distribution dot inverse. And this actually will
create a value for us based on a probability
distribution. If we give it a
probability BINOM.INVERSE will spit out a variable. The trick is that
the probability we’re going to give it is
from the RAND function. So every time we hit
F9, BINOM.INVERSE will spit out a
random variable based on the binomial distribution. Now I’m going to do Equals
RAND argumentless function. Equals BINOM. And in our prereq class,
we did BINOMDIST a lot. We even saw BINOM.DIST.RANGE. But we actually never
did BINOM.INVERSE. But no problem. It’s relatively
easy to understand. Number of trials,
that is our 35. That means we have
35 chances to convert the attendees to new customers. Comma. The probability for each
one of those attempts, based on past data, is 0.65. And Alpha is the
cumulative probability. So if I put this
0.68 into Alpha, BINOM.INVERSE will think
that we’re going to up to that particular
probability, whatever it is, and it will pick the number
of successes– that’s our x variable– in 35 tries. So when I close parentheses
and Control Enter, for this first F9 or
running of our experiment, we got a random variable of 1. That means we got one
new customer out of 35. If I hit F9, this random time
we got four new customers out of 35 tries. F9, F9. Now you can see that this
distribution, given 0.065, we can see that up past
eight, it’s 0 all the way. So we’re pretty much stuck
getting between about 0 and 7. Now we can put it all
together, BINOM.INVERSE. Remember, DIST always
gives us the probability if we put in an x. INVERSES always gives us the
x, if we put in a probability. And that’s true for all of
the distribution functions. INVERSE gives us the x value. DIST gives us the probability. So I’m going to use the
inverse number of trials. Comma. Probability of
success on any one. And our alpha, RAND,
argumentless function. Close parentheses. Close parentheses. And there we go. So again, these
two are different. That formula right there
is linked to that RAND. This one has an internal RAND. So as I hit F9, that is
our randomizing formula for a binomial variable. Now we want to go over and
look at our third example, continuous uniform
probability distribution. Now we actually did example
in last video of this one. Here is our variable,
material cost per unit. It is a continuous variable. We’re assuming uniform
distribution, the min and the max value. Now unlike last video, I’m going
to show you two different ways. And I kind of like
the first way, and this is the way we
did it last variable. Equals, well, RAND. And RAND between both
use uniform distribution. So I’m using RAND between. The bottom comma and the
top, or the min and the max. Now right now it would treat
this as a discrete variable, if I F9, F9, and you don’t
F9 key, F9, it’s treating this not
as a continuous but as a discrete variable,
which is fine if you have that. But we’re going to F2, and
just like we did last video, we’re going to multiply
this times 100, multiply this times 100. That will give us 8615. 9233. I need to slide the decimal
over to the left two positions, so F2 divided by 100. And that will be our formula. 9858, F9, 105. Wow. So that’s an example where
we got exactly the max value. Look at that. F9. If you run a simulation,
whatever max value you’ll have or whatever particular
penny value you have, if you did it 10,000
times, each one would be about the
same percentage. Now there’s an alternative. And before Excel 2010, when
to do this one. We had to say
whatever the min is– and actually this is a formula
they show you in the textbook– and we have to add to the Min
the difference between the Min and Max. And you have to
say Max minus Min. Now think about this. What’s the difference? \$30 plus \$75? If I were to enter
this right now it would always give
us the max value. F2. But what if I multiplied
the difference times 0. Then, of course, I’d
get the Min value. F2. What if I multiplied the
difference between 1. Well, of course I’d
get the max value. Well, that should lead us
right to our conclusion. We can simply use
the RAND function. The RAND delivers a
number between 0 and 1. Control Enter. Now I’m going to
Control Shift tilde to remove the number formatting. If you start hitting F9,
remember that the RAND function has a 15-digit number,
so there’s 15 digits on every single one of these. So we probably
want to round this. We actually want to round
just the second part. That’s already an integer
or a whole number. So round. And then I’m going
to come to the end. Comma. And the number of digits, I
want pennies, so I do two. Close parentheses. Control Enter. Now these are two different
formulas based on a RAND here and a RAND between, so they’ll
be creating different numbers if I hit F9, F9. F9. So that’s how you can
do it for a uniform, either one of those formulas. We have one last
variable to look at, continuous normal
probability distribution. Now our variable we
want to randomize is demand for a
product, continuous. And it is a normal distribution. That means from
our past data we’ve already determined that demand
has a normal distribution. If that’s the case, and we
want to create a randomizing formula, we need to know the
mean and the standard deviation from our past data. And so we do have that. Now again, we’re going to use
the RAND function, which will create a uniform distribution. But now we want to see,
just like for BINOM.INVERSE, there’s a norm for
normal distribution. And inverse means I will
for probability, INVERSE is for our x. There’s the probability. And like all of our
distribution functions, it will calculate from
the smallest value– in our case, negative
infinity– all the way up to the x we’re giving it. So that means if we
give it a probability, it will assume from negative
infinity up to some point. So I can simply
click on this, 77%. That’s going to be
somewhere up in here. Comma. And then we need to
give it the mean. Comma. And the standard deviation. And Lo and behold, now we’re
getting a random variable based on the normal distribution. Now here’s a picture, right? That probability of 0.91, it
goes all the way up to there. And so the NORM.INVERSE
will tell you whatever the x value is. So the demand here
would be 24,470. If I hit F9, F9, you can
see different probabilities from the RAND function get
thrown into NORM.INVERSE and then it spits
out the x value. F9. Now if this is demand, F2
we’re going to use round. Open parentheses. Comma, 0. Close parentheses. Control Enter. Now if I hit F9,
you can see we’re getting all sorts of different
possible random values for our demand. Now remember the
normal distribution, the bulk of probability
is right in here. So it will be highly unlikely
that we get something down here or up here. And of course, way out here,
past three standard deviations in either direction, it’s
going to be incredibly small, like 0.5% out of the total 100%. So if I hit F9,
very rarely are we going to get a value up here. Most of the value is going
to be right by the mean. Now another thing
to consider, when you’re creating a
randomizing formula in Excel, is the relationship between
mean and standard deviation. If this were 10,000, that
means two standard deviations would be below zero,
which is perfectly all right for this distribution
but not necessarily for demand. So you have to think
never go below zero, then we’re going to have
to amend the formula. And here’s the entire formula. In essence, we’re
going to have to take whatever this is,
or zero, whichever one is the maximum amount. So the full formula would be,
hey, please give me the Max. And I’m simply going to
put for our number, 0. That’s one of the options for
the Max function to look at. Comma. Or round NORM.INVERSE. The probability is RAND
argumentless function. Comma. The mean, the
standard deviation. There it is. We’re going to close off. And notice, I’m
reading my screen tips, when I’m nesting functions. It’s kind of hard to see where
you are, except for our screen tip will really help us out. So when I close parentheses
I know I’m back in round. I have to get our comma to
get to number of digits zero. I see the screen tip. Close parentheses on the round. That whole number 2, there,
is our randomizing function. Only when this is below
zero will the Max function pick out the zero. So most of the time
with this distribution it’s clearly going to get
mostly positive values. And if we ran 10,000
records– which we’ll do in our next
video for a simulation– we might get some zeros there. Because three
standard deviations out happens once in awhile. You ready? So that’s our full formula. Now again, these will
get two different values, because it has two
different RAND functions. I’m going to Control Shift tilde
to apply general formatting. And there we go. There is our full formula
for demand randomly creating values. So in this video we saw how
to use the normal probability distribution and this
formula to randomly generate demand numbers. Back on continuous uniform we
saw two different formulas, F9, F9, to create random variables
for a uniform distribution for a material cost per unit. Background discrete binomial. If I hit F9, F9, we’re randomly
creating a number of successes for a binomial distribution. And finally, all the way
back on discrete probability distribution, these
formulas here, LOOKUP and RAND
together, randomly creating values from a discrete
probability distribution. Next video we’ll see a
comprehensive new product profit analysis with a bunch
of different uncertain random variables all in one simulation. All right. We’ll see you next video.

## 7 Replies to “Basic Excel Business Analytics #65: Create Random Variables: Custom, Binomial, Uniform, Normal”

1. Mohammed Suleiman says:

amazing , thank you for your always great efforts &info

2. Bobbo Nelli says:

Amazing videos, congrats from Italy

3. Bobbo Nelli says:

where can I find the Histogram conditional formatting?

4. Tomas Hujo says:

I was wondering already while watching previous video in the series how we can simulate numbers based on custom frequency ( I knew binomial and normal is possible. ) . So the solution is pretty awesome idea !!!!

5. Aamera Khan says:

shouldn't the sum in a5 cell be 0.1 instead of 0?

6. Syed Ali says:

How do you show the formulats in the cells in the column right of where the calculations are.

7. Revanasiddappa Madihalli Svce Civil says:

I really thank you for your wonderful teaching.. If only the coefficient of variance(COV) is known for a parameter then how to decide upper and lower limit for that particular parameter to generate random numbers. And the parameter is having normal distribution. Kindly Help me in this regard.