# Basic Excel Business Analytics #65: Create Random Variables: Custom, Binomial, Uniform, Normal

Welcome to Highline BI348

Class Video Number 65. If you want to

download this workbook, BI348Chapter11Video065 and

follow along, click on the link below the video. Last video we’re talking

about creating a Monte Carlo simulation. This video we want to talk about

creating random or uncertain variables in Excel

for, of course, our Monte Carlo simulation. Now we’re going to see four

different variables based on four different distributions. We’ll look at our relative

frequency distribution variable, binomial

distribution variable, uniform, and normal

distribution variable. Now remember, before we can

create random variables, we must have a good

estimate of what the probability distribution

is for that variable. So in our first

example, we’ll go over two discrete probability

distribution sheet. Here is our variable

we need to randomize, direct labor cost per unit. Now this is a

discrete variable, and from a historical relative

frequency distribution– that means from the

accounting department we have estimated

these probabilities for these direct labor costs. Now in the process

we rounded them off, so we have an actual 39,

then a 40, then a 41. So this is a discrete

distribution. Actually, last video

we had a cost example. And we didn’t round it off. And we treated it as a

continuous uniform variable. But this one we have a different

set of probabilities and it is a discrete variable. So how in the world do we take

this, a relative frequency distribution or a

probability distribution, and use it to create

a random variable? Well, we’re first going to

look at the RAND function. Last video we looked

at RAND between. Here we’re going

to look at RAND. Both of these functions

use a uniform distribution for randomly generating numbers. Now, this uniform

distribution for RAND won’t work straight

out of the box, but we’ll do something

interesting in the formulas down here to get this

probability distribution to show up in our formula. Now what does the

RAND function do? I actually want to

backspace, because this is an argumentless function. You don’t put anything in

between the open and the closed parentheses. The RAND function

will automatically generate a random number when

I Control Enter, between 0 and 1– 0 is included, 1 is

not– up to 15 digits long. Now I don’t have all

the digits showing here. You could increase or decrease. But there’s 15 digits there. That means like 999

trillion possible values. And it is selecting amongst

all those values in accordance with a uniform distribution. And every time I

hit the F9 key, you can see the formula

is randomizing, selecting a number

between 0 and 1. So this by itself is

not going to help us with this distribution. But we want to get clever here. We want to convert this

into a lookup table and have this RAND

as our lookup value. Now let’s just hit

F9 here, F9 here. 99. One more, 32. If I were to add, let’s say,

these first three probabilities up, this would be 45. So 32. If I were to use this

as a lookup value and have a column with

cumulative probabilities, it would say 0.1, 0.25, 0.45. So this would hit the first

bigger one and jump back. And it would select that 40. Now I want to go ahead and

create this extra column here, because that’s the

trick for using a historical relative

frequency distribution. Now the first value has to be 0,

because when you have a VLOOKUP you have to have the

smallest possible value. Now this value right

here has to be 0.1. This value right here

has to be the sum of both of these, which is 0.25. So watch this. I’m going to get tricky. I’m going to use

the Sum function with an expandable range. I’m going to click

in that cell C5. Colon gives me the second C5. Close parentheses. Put your cursor right in

the first cell reference, F4 to lock it. That means the

first C5 is locked but the second one is not. Because I’m copying

down across rows, that 5 will turn to 6,

then 7, successively adding and creating a running

total or a cumulative total. Control Enter. And copy it down. To see that expandable range,

select any cell and F2. You can see the 5 moved to 6,

and the range is expanding. If I come down to the end, boom. That’s to 10. That has expanded the whole way. So for the first column

of our lookup table, we need our cumulative total. Now we can see

explicitly if this is our first column

for VLOOKUP or LOOKUP, because we’re doing

an approximate match. That’s the first column. This is the second

column with the thing to go and get and

bring back to the cell. Notice 0.400 somewhat. It comes racing down, bumps

into the first value bigger, jumps back. And it would go and get the 41. Now lookup functions and

VLOOKUP with approximate match don’t literally search through,

bump into the first bigger one, and jump back. That’s just a metaphor

for understanding how a binary search works. Let’s try this. Let’s us go ahead. And by all means, if you want

to use VLOOKUP and use the RAND as the lookup value, the

lookup table as this and then comma 2 for a column index. But we’re going to end up

doing a lot of formulas for our simulations. And I don’t want to have to

type the extra column here. So if we know we’re doing

approximate match lookup, I’m just going to use LOOKUP. And actually we covered

this topic extensively back in Chapter 7. So I’m using Equals LOOKUP. And I also have a slide

in the PowerPoints, if you download them, about

how LOOKUP function works, and how it’s different

than VLOOKUP. So I’m going to look this up. There’s the lookup value. Comma. And the array is the one I want. First column, second column. Now the beautiful

thing about LOOKUP is it will always select a

value from the last column in the array table. It happens to be two

columns, so of course it’s going to get that. That’s all I have to put. Close parentheses

and Control Enter. There’s our random value. So what did it do? It took 11%. It raced down. It bumped into the

first bigger one, jumped back, and got the 40%. If I hit F9 key to

randomize 36, it raced through, bumped

into the first bigger one, jumped back, got the 41. I could keep going, randomizing. You can also see a

visual over here. So LOOKUP is taking the

cumulative probability all the way up to the

42, stopping there. And so it’s getting the 42. F9. Here it’s getting the 41. I’m hitting F9 a bunch. Now it’s getting the 43. Now I’m going to add a

little formatting over here. And I followed my convention. The cells that have something

typed into them have no fill. The ones that have a

formula have the fill. And that’s how we can create

a random variable based on a custom relative frequency

distribution or estimated probabilities. Now when we get to

our formulas later, we’re not going to do it

in two separate cells. We’re simply going to go LOOKUP,

RAND, argumentless function. Comma. The ARRAY. Highlight first

and second column. It will automatically get

it from the second column. And Control Enter. Now if I hit F9 here, these

two are going to be different. Because that cell is

linked to that RAND. And this one has a RAND

internally, F9, F9. So I’m randomly selecting based

on this frequency distribution. You know, if you

didn’t trust this and wanted to prove that

this formula works, go ahead and we learn how to do

data tables last video, do a data table with 10,000

rows based on that formula. And then do a pivot

table based on it and build your own

probability distribution. And you’ll see that the

simulated value probability table will be almost

exactly equal to this. Now that is our first example. Now let’s go over to discrete

binomial distribution example. Now we have a situation here. We have number of people

who attend a seminar. And that’s going to

be number of trials for our binomial experiment. So 35 people come. A success is going

to be whether or not any one of those

attendees at the seminar becomes a new customer. And the probability of success,

from our past experience for any one attendee becoming

a new customer, is about 6.5%. Now before we can use the

binomial distribution, we have to run our four

tests for whether or not it’s a binomial experiment. Are there are a fixed

number of trials? Yes. Only two outcomes

for each trial? Yes. With either a new

customer or they’re not. So we’ll say yes. Probability of success

stays the same each time? Yes. This will stay the

same each time. Each trial is independent? We will assume that that’s

true for this experiment. If all of these

are true, then we can use the binomial

distribution. Now we don’t have to get fancy

with VLOOKUP, because there’s a function called

BINOM.INVERSE just like there will be one called NORM

for normal distribution dot inverse. And this actually will

create a value for us based on a probability

distribution. If we give it a

probability BINOM.INVERSE will spit out a variable. The trick is that

the probability we’re going to give it is

from the RAND function. So every time we hit

F9, BINOM.INVERSE will spit out a

random variable based on the binomial distribution. Now I’m going to do Equals

RAND argumentless function. Equals BINOM. And in our prereq class,

we did BINOMDIST a lot. We even saw BINOM.DIST.RANGE. But we actually never

did BINOM.INVERSE. But no problem. It’s relatively

easy to understand. Number of trials,

that is our 35. That means we have

35 chances to convert the attendees to new customers. Comma. The probability for each

one of those attempts, based on past data, is 0.65. And Alpha is the

cumulative probability. So if I put this

0.68 into Alpha, BINOM.INVERSE will think

that we’re going to up to that particular

probability, whatever it is, and it will pick the number

of successes– that’s our x variable– in 35 tries. So when I close parentheses

and Control Enter, for this first F9 or

running of our experiment, we got a random variable of 1. That means we got one

new customer out of 35. If I hit F9, this random time

we got four new customers out of 35 tries. F9, F9. Now you can see that this

distribution, given 0.065, we can see that up past

eight, it’s 0 all the way. So we’re pretty much stuck

getting between about 0 and 7. Now we can put it all

together, BINOM.INVERSE. Remember, DIST always

gives us the probability if we put in an x. INVERSES always gives us the

x, if we put in a probability. And that’s true for all of

the distribution functions. INVERSE gives us the x value. DIST gives us the probability. So I’m going to use the

inverse number of trials. Comma. Probability of

success on any one. And our alpha, RAND,

argumentless function. Close parentheses. Close parentheses. And there we go. So again, these

two are different. That formula right there

is linked to that RAND. This one has an internal RAND. So as I hit F9, that is

our randomizing formula for a binomial variable. Now we want to go over and

look at our third example, continuous uniform

probability distribution. Now we actually did example

in last video of this one. Here is our variable,

material cost per unit. It is a continuous variable. We’re assuming uniform

distribution, the min and the max value. Now unlike last video, I’m going

to show you two different ways. And I kind of like

the first way, and this is the way we

did it last variable. Equals, well, RAND. And RAND between both

use uniform distribution. So I’m using RAND between. The bottom comma and the

top, or the min and the max. Now right now it would treat

this as a discrete variable, if I F9, F9, and you don’t

need to worry about this. This is linked below down here. But as I hit the

F9 key, F9, it’s treating this not

as a continuous but as a discrete variable,

which is fine if you have that. But we’re going to F2, and

just like we did last video, we’re going to multiply

this times 100, multiply this times 100. That will give us 8615. 9233. I need to slide the decimal

over to the left two positions, so F2 divided by 100. And that will be our formula. 9858, F9, 105. Wow. So that’s an example where

we got exactly the max value. Look at that. F9. If you run a simulation,

whatever max value you’ll have or whatever particular

penny value you have, if you did it 10,000

times, each one would be about the

same percentage. Now there’s an alternative. And before Excel 2010, when

this was added automatically, oftentimes we had

to do this one. We had to say

whatever the min is– and actually this is a formula

they show you in the textbook– and we have to add to the Min

the difference between the Min and Max. And you have to

say Max minus Min. Now think about this. What’s the difference? $30 plus $75? If I were to enter

this right now it would always give

us the max value. F2. But what if I multiplied

the difference times 0. Then, of course, I’d

get the Min value. F2. What if I multiplied the

difference between 1. Well, of course I’d

get the max value. Well, that should lead us

right to our conclusion. We can simply use

the RAND function. The RAND delivers a

number between 0 and 1. Control Enter. Now I’m going to

Control Shift tilde to remove the number formatting. If you start hitting F9,

remember that the RAND function has a 15-digit number,

so there’s 15 digits on every single one of these. So we probably

want to round this. We actually want to round

just the second part. That’s already an integer

or a whole number. So round. And then I’m going

to come to the end. Comma. And the number of digits, I

want pennies, so I do two. Close parentheses. Control Enter. Now these are two different

formulas based on a RAND here and a RAND between, so they’ll

be creating different numbers if I hit F9, F9. F9. So that’s how you can

do it for a uniform, either one of those formulas. We have one last

variable to look at, continuous normal

probability distribution. Now our variable we

want to randomize is demand for a

product, continuous. And it is a normal distribution. That means from

our past data we’ve already determined that demand

has a normal distribution. If that’s the case, and we

want to create a randomizing formula, we need to know the

mean and the standard deviation from our past data. And so we do have that. Now again, we’re going to use

the RAND function, which will create a uniform distribution. But now we want to see,

just like for BINOM.INVERSE, there’s a norm for

normal distribution. And inverse means I will

calculate your x value. So you ready? Equals NORM, DIST is

for probability, INVERSE is for our x. There’s the probability. And like all of our

distribution functions, it will calculate from

the smallest value– in our case, negative

infinity– all the way up to the x we’re giving it. So that means if we

give it a probability, it will assume from negative

infinity up to some point. So I can simply

click on this, 77%. That’s going to be

somewhere up in here. Comma. And then we need to

give it the mean. Comma. And the standard deviation. And Lo and behold, now we’re

getting a random variable based on the normal distribution. Now here’s a picture, right? That probability of 0.91, it

goes all the way up to there. And so the NORM.INVERSE

will tell you whatever the x value is. So the demand here

would be 24,470. If I hit F9, F9, you can

see different probabilities from the RAND function get

thrown into NORM.INVERSE and then it spits

out the x value. F9. Now if this is demand, F2

we’re going to use round. Open parentheses. Comma, 0. Close parentheses. Control Enter. Now if I hit F9,

you can see we’re getting all sorts of different

possible random values for our demand. Now remember the

normal distribution, the bulk of probability

is right in here. So it will be highly unlikely

that we get something down here or up here. And of course, way out here,

past three standard deviations in either direction, it’s

going to be incredibly small, like 0.5% out of the total 100%. So if I hit F9,

very rarely are we going to get a value up here. Most of the value is going

to be right by the mean. Now another thing

to consider, when you’re creating a

randomizing formula in Excel, is the relationship between

mean and standard deviation. If this were 10,000, that

means two standard deviations would be below zero,

which is perfectly all right for this distribution

but not necessarily for demand. So you have to think

about your variable. If our demand can

never go below zero, then we’re going to have

to amend the formula. And here’s the entire formula. In essence, we’re

going to have to take whatever this is,

or zero, whichever one is the maximum amount. So the full formula would be,

hey, please give me the Max. And I’m simply going to

put for our number, 0. That’s one of the options for

the Max function to look at. Comma. Or round NORM.INVERSE. The probability is RAND

argumentless function. Comma. The mean, the

standard deviation. There it is. We’re going to close off. And notice, I’m

reading my screen tips, when I’m nesting functions. It’s kind of hard to see where

you are, except for our screen tip will really help us out. So when I close parentheses

I know I’m back in round. I have to get our comma to

get to number of digits zero. I see the screen tip. Close parentheses on the round. That whole number 2, there,

is our randomizing function. Only when this is below

zero will the Max function pick out the zero. So most of the time

with this distribution it’s clearly going to get

mostly positive values. And if we ran 10,000

records– which we’ll do in our next

video for a simulation– we might get some zeros there. Because three

standard deviations out happens once in awhile. You ready? So that’s our full formula. Now again, these will

get two different values, because it has two

different RAND functions. I’m going to Control Shift tilde

to apply general formatting. And there we go. There is our full formula

for demand randomly creating values. So in this video we saw how

to use the normal probability distribution and this

formula to randomly generate demand numbers. Back on continuous uniform we

saw two different formulas, F9, F9, to create random variables

for a uniform distribution for a material cost per unit. Background discrete binomial. If I hit F9, F9, we’re randomly

creating a number of successes for a binomial distribution. And finally, all the way

back on discrete probability distribution, these

formulas here, LOOKUP and RAND

together, randomly creating values from a discrete

probability distribution. Next video we’ll see a

comprehensive new product profit analysis with a bunch

of different uncertain random variables all in one simulation. All right. We’ll see you next video.

## 7 Replies to “Basic Excel Business Analytics #65: Create Random Variables: Custom, Binomial, Uniform, Normal”

amazing , thank you for your always great efforts &info

Amazing videos, congrats from Italy

where can I find the Histogram conditional formatting?

I was wondering already while watching previous video in the series how we can simulate numbers based on custom frequency ( I knew binomial and normal is possible. ) . So the solution is pretty awesome idea !!!!

shouldn't the sum in a5 cell be 0.1 instead of 0?

How do you show the formulats in the cells in the column right of where the calculations are.

I really thank you for your wonderful teaching.. If only the coefficient of variance(COV) is known for a parameter then how to decide upper and lower limit for that particular parameter to generate random numbers. And the parameter is having normal distribution. Kindly Help me in this regard.