4.4.4 Random Variables: Uniform & Binomial: Video

PROFESSOR: Certain kinds
of random variables keep coming up, so let’s
look at two basic examples now, namely uniform
random variables and binomial random variables. Let’s begin with
uniform, because we’ve seen those already. So a uniform random
variable means that all the values
that it takes, it takes with equal probability. So the threshold variable Z
took all the values from 0 to 6 inclusive, each
with probability 1/7. So it was a basic example
of a uniform variable. Other examples come up as well. If D is the outcome of a fair six-sided die, the probability that it comes up 1, or 2, or any other face through 6, is 1/6 each. Another game is the
four-digit lottery number where it’s supposed to be the
case that the four digits are each chosen at random, which
means that the possibilities range from four 0’s up through
four 9’s for 10,000 numbers. And they’re supposed to
be all equally likely. So the probability that the lottery winds up with four 0's is the same as the probability that it ends up with 0001, which is the same as the probability that it ends up with four 9's: each is 1/10,000. So that's another uniform random variable.
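Just to make the pattern concrete, here's a small Python sketch (my own illustration, not part of the lecture) that builds the PMF of a uniform random variable and checks the three probabilities just mentioned: 1/7 for the threshold variable Z, 1/6 for the die, and 1/10,000 for the lottery.

```python
from fractions import Fraction

def uniform_pmf(outcomes):
    """PMF of a uniform random variable: every outcome gets probability 1/|outcomes|."""
    p = Fraction(1, len(outcomes))
    return {v: p for v in outcomes}

# Threshold variable Z from the earlier example: values 0..6, each with probability 1/7.
Z = uniform_pmf(range(7))
print(Z[3])                          # 1/7

# Fair six-sided die D: each face has probability 1/6.
D = uniform_pmf(range(1, 7))
print(D[6])                          # 1/6

# Four-digit lottery: 0000 through 9999, each with probability 1/10000.
lottery = uniform_pmf(range(10000))
print(lottery[0], lottery[9999])     # 1/10000 1/10000
```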
Let's prove a little lemma that will be of use later. It's just some practice with uniformity. Suppose that R1, R2, R3 are three random variables. They're mutually independent, and R1 is uniform. I don't really care about the other two. I do care, technically, that they only take values that R1 can take as well. I haven't said that on this slide, but that's what we're assuming. And then the claim is that each of the pairs is independent: the event that R1 is equal to R2 is independent of the event that R2 is equal to R3, which is independent of the event that R1 is equal to R3. Now, these events overlap. There's an R1 here and an R1
there and there's an R2 here and an R2 there. So even though R1, R2, R3 are mutually independent, it isn't at all clear that these events are mutually independent. And in fact, they're not mutually independent, but they are pairwise independent. They're obviously not
three-way independent– that is, mutually
independent– because if I know that R1 is equal to R2 and I
know that R2 is equal to R3, it follows that
R1 is equal to R3. So given those two events, the probability of the third jumps to certainty. So this is the useful lemma: if I have the three variables and look at the three possible pairs that might be equal, then whether any one pair is equal is independent of whether any other pair is equal. Now, let me give a
handwaving argument. There’s a more
rigorous argument based on total probability
that appears as a problem in the text. But for the intuitive idea, let's use the fact that R1 is the uniform variable, and R1 is independent of R2 and R3. Certainly, that implies that R1 is independent of the event that R2 is equal to R3, because R1 is mutually independent of both R2 and R3: it doesn't matter what they do, so R1 is independent of this event that R2 is equal to R3. Now, because R1 is uniform, it has the same probability p of equaling every possible value that it can take. And since R2 and R3 only take values that R1 could take, the probability that R1 hits the value that R2 and R3 happen to have is still p. That's the informal argument. So in other words, the claim
is that the probability that R1 is equal to R2
given that R2 is equal to R3 is simply the probability
that R1 happens to hit R2, whatever value R2 has. This equation says that the event that R1 equals R2 is independent of the event that R2 equals R3. And in fact, in both cases it's the same probability that R1 is equal to any given value, namely the probability p that the uniform R1 has of equaling each of its possible values. You can think about that,
see if it's persuasive. It's an OK argument, but I was a little bothered by it. I wasn't happy with it until I sat down and really worked it out formally to justify this somewhat handwavy proof of the lemma.
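If the handwaving bothers you too, one way to get comfortable with the lemma before working it out formally is to simulate it. Here's a minimal Python sketch (my own; the particular three-valued distributions are just assumptions for illustration) that estimates the relevant probabilities: the joint probability of two equalities matches the product of the pairwise probabilities, while it far exceeds the three-way product.

```python
import random

random.seed(0)
TRIALS = 200_000
VALUES = [0, 1, 2]
WEIGHTS = [0.5, 0.3, 0.2]   # R2 and R3 are biased; R1 is uniform on the same values

eq12 = eq23 = eq13 = eq_all = 0
for _ in range(TRIALS):
    r1 = random.choice(VALUES)                # uniform
    r2 = random.choices(VALUES, WEIGHTS)[0]   # independent of r1 and r3
    r3 = random.choices(VALUES, WEIGHTS)[0]
    eq12 += (r1 == r2)
    eq23 += (r2 == r3)
    eq13 += (r1 == r3)
    eq_all += (r1 == r2 == r3)                # same event as [R1 = R2 and R2 = R3]

p12, p23, p13, p_all = (c / TRIALS for c in (eq12, eq23, eq13, eq_all))
# Pairwise independence: Pr[R1=R2 and R2=R3] matches Pr[R1=R2] * Pr[R2=R3] (about 0.127 here).
print(p_all, p12 * p23)
# Not mutually independent: the joint probability far exceeds the triple product (about 0.042).
print(p_all, p12 * p23 * p13)
```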
All right. Let's turn from uniform random variables to binomial random variables. They're probably the most important single example of a random variable that
comes up all the time. So the simplest definition
of a binomial random variable is the one that
you get by flipping n mutually independent coins. Now, they have an order,
so you can tell them apart. Or again, you can say that
you flip one coin n times, but each of the flips is
independent of all the others. Now, there are two parameters here, an n and a p, because we don't assume that the flips are fair. One parameter is how many flips there are. The other parameter is the probability p of a head, which might be biased so that heads are more likely or less likely than tails. The fair case would be when p is 1/2. So for example, if
n is 5 and p is 2/3, what’s the probability that we
consecutively flip head, head, tail, tail, head? Well, because
they’re independent, the probability
of this is simply the product of the
probability that I flip a head on the
first toss, which is probability of H,
which is p; the probability of H on the second toss; the probability of T on the third; T on the fourth; and H on the fifth. So I can replace each of those: 2/3 is the probability of a head, so that's 2/3, 2/3, then 1/3, 1/3, since 1 minus 2/3 is the probability of a tail, and then 2/3 again. And I discover that the probability of HHTTH is 2/3 cubed times 1/3 squared. Or abstracting, the probability
of a sequence of n tosses in which there are i heads
and the rest are tails, n minus i tails, is
simply the probability of a head raised to the i-th
power times the probability of a tail, namely
1 minus p, raised to the n minus i-th power. Given any particular sequence
of H’s and T’s of length n, this is the probability that’s
assigned to that sequence. So all sequences with the
same number of H’s have the same probability. But of course, with
different numbers of H’s they have different probabilities. Well, what’s the probability
that you actually toss i heads and n minus i tails
in the n tosses? That's going to be equal to the number of possible sequences that have this property of i heads and n minus i tails, times the probability of any one such sequence. Well, the number of such sequences is simply the number of ways to choose the i places for the heads out of the n tosses. So it's going to be n choose i. So we've just figured out that
the probability of tossing i heads and n minus
i tails is simply n choose i times p to the i,
1 minus p to the n minus i. In short, the probability that the number of heads is i is equal to this number. That is, the probability that the binomial variable with parameters n and p is equal to i is n choose i, times p to the i, times 1 minus p to the n minus i. This is a pretty basic formula. If you can't memorize it, then make sure it's written on any crib sheet you take to an exam.
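As a concrete check of both formulas, here's a short Python sketch (mine, not the lecture's) using the example n = 5 and p = 2/3: it computes the probability of one particular sequence and the binomial probability of getting exactly i heads.

```python
from fractions import Fraction
from math import comb

p = Fraction(2, 3)   # probability of a head
n = 5

def sequence_prob(seq, p):
    """Probability of one particular sequence of H's and T's, e.g. 'HHTTH'."""
    i = seq.count("H")                      # number of heads in the sequence
    return p**i * (1 - p)**(len(seq) - i)   # p^i * (1-p)^(n-i)

def binomial_pmf(i, n, p):
    """Probability of exactly i heads in n independent flips: C(n,i) p^i (1-p)^(n-i)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

print(sequence_prob("HHTTH", p))     # (2/3)^3 * (1/3)^2 = 8/243
print(binomial_pmf(3, n, p))         # C(5,3) * 8/243 = 80/243

# Sanity check: the PMF over i = 0..n sums to 1.
print(sum(binomial_pmf(i, n, p) for i in range(n + 1)))   # 1
```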
So the probability density function abstracts out some properties of random variables. Basically, it just tells you, for every possible value, the probability that the random variable takes that value. So the probability density function of R, PDF of R, is a function on the real values, and it tells you for each a the probability that R is equal to a. So what we've just said is
that the probability density function of the binomial
random variable characterized by parameters n and p at i
is n choose i, p to the i, 1 minus p to the n
minus i, where we’re assuming that i is an
integer from 0 to n. If I look at the
probability density function for a uniform
variable, then it’s constant. The probability density
function on any possible value v that the uniform variable
can take is the same. This applies for v
in the range of U. So in fact, you could
say exactly what it is. It’s simply 1 over the
size of the range of U, if U is uniform. A closely related
function that describes a lot about the behavior
of a random variable is the cumulative
distribution function. It’s simply the probability that
R is less than or equal to a. So it’s a function
on the real numbers, from reals to reals,
where CDF R of a is the probability that R
is less than or equal to a. Clearly, given the PDF, you can get the CDF. And given the CDF, you can get the PDF. But it's convenient to have both around.
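To see how the two functions determine each other, here's a brief Python sketch (my own illustration, reusing the binomial example from above) that builds the CDF from the PDF and recovers the PDF back from the CDF.

```python
from fractions import Fraction
from math import comb

n, p = 5, Fraction(2, 3)

# PDF of the binomial variable: PDF(i) = C(n,i) p^i (1-p)^(n-i) for i = 0..n.
pdf = {i: comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)}

# CDF from the PDF: CDF(a) = Pr[R <= a] = sum of PDF(i) over i <= a.
def cdf(a):
    return sum(prob for i, prob in pdf.items() if i <= a)

# PDF back from the CDF: PDF(i) = CDF(i) - CDF(i - 1).
pdf_again = {i: cdf(i) - cdf(i - 1) for i in range(n + 1)}

print(cdf(2))             # Pr[at most 2 heads] = 17/81
print(pdf_again == pdf)   # True: the two functions carry the same information
```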
Now, the key observation about these is that once we've abstracted
out to the PDF and the CDF, we don’t have to think about
the sample space anymore. They do not depend
on the sample space. All they’re telling
you is the probability that the random variable
takes a given value, which is, in some ways, the
most important data about a random variable. You need to fall back on
something more general than the PDF or the
CDF when you start having dependent
random variables, and you need to know how
the probability that R takes a value changes, given
that s has some property or takes some other value. But if you’re just looking
at the random variable alone, essentially everything
you need to know about it is given by its density
or distribution functions. And you don’t have to worry
about the sample space. And this has the advantage that
both the uniform distributions and binomial distributions
come up [AUDIO OUT] –and it means that all of these
different random variables, based on different
sample spaces, are going to share a
whole lot of properties. Everything that I derive
based on what the PDF is is going to apply
to all of them. That’s why this abstraction
of a random variable in terms of a probability
density function is so valuable and key. But remember, the definition
of a random variable is not that it is a
probability density function, rather it’s a function from
the sample space to values.
