From Las Vegas to Monte Carlo with Markov’s Inequality

One of the basic results in probability theory tells us that it is unlikely a random event deviates from its expected value with high probability.
Theorem 1. (Markov’s Inequality)
For any non-negative real valued random variable X, we have
{\text Pr}[X \ge \alpha] \le \frac{\mathbb{E}[X]}{\alpha}.

Proof.
Note that
\mathbb{E}[X] = \sum_{x \ge 0} x \cdotp p_X(x)
= \underbrace{\sum_{x<\alpha} x \cdotp p_X(x)}_{\ge 0} + \sum_{x \ge \alpha} x \cdotp p_X(x)
\ge \sum_{x \ge \alpha} \alpha \cdotp p_X(x)
\ge \alpha \cdotp \text{Pr}[X \ge \alpha],
as desired. \Box

There is an interesting application of this inequality in theoretical computer science. Knowing this probability bound, it allows us to control a probabilistic algorithm derived from another probabilistic algorithm.

There are at least two classes of probabilistic algorithms that are interesting when we look at in which aspect it is random:

  • a class of algorithms whose output is always correct, but its running time is random (can take very long time),
  • a class of algorithms whose running time is deterministic (for example, is a deterministic function in the length of input), but its answer is sometimes wrong.

The first class is usually called Las Vegas algorithm and the second class is usually called Monte Carlo algorithm. How are these two algorithms related?

It appears that if we can solve a problem using Las Vegas algorithm, we can also solve it using a Monte Carlo algorithm with reasonably small error probability. Let me argue why this is a useful fact. For many practical purposes, we want to have a decision right after certain amount of time; it is okay to make mistakes sometimes (as we may be able to handle it later).

To obtain a Monte Carlo algorithm, we can just simulate the Las Vegas algorithm with little modifications. Our modifications aim at fulfilling two requirements: it runs in a “fixed” time and it outputs an answer in a reasonably good probability. To fulfill the first requirement, we can just fix a threshold for the number of computational steps the Las Vegas algorithm takes, say t(x) steps for an input of size x. If after t(x) steps, our Las Vegas algorithm outputs an answer, we can just output the same answer in our Monte Carlo algorithm. This will always be a correct answer by the nature of Las Vegas algorithm. If not, we just output a made-up answer (depending on our need).

But, how should we choose t(x)? We have freedom here to choose the function t(x), but we may want to use this to fulfill the second requirement: to get a correct answer with good probability. Note that we can analyze our Las Vegas algorithm to obtain its expected running time on input x, say p(x). We know that our Monte Carlo algorithm will output a correct answer at least when the simulation finishes in t(x) steps. If X is the running time of the simulation of our Las Vegas algorithm, then the probability of having a correct answer is at least
\text{Pr}[X < t(x)] = 1- \text{Pr} [ X \ge t(x) ]
\ge 1 - \frac{ \mathbb{E}[X] }{t(x)}
= 1- \frac{p(x)}{3p(x)}
= \frac23,
which is good, if we pick t(x)=3p(x). Obviously, we can also pick larger t(x) to gain better probability. But, as long as we pick t(x)>2p(x), we can also amplify this probability by doing multiple simulations and process a final answer (in case of Boolean answer, for example, we do majority voting).

Note that we have described an abstraction of how to transform any Las Vegas algorithm to a Monte Carlo algorithm. This can be very useful if instead of getting an always-correct-answer, we know when we will get an answer and maybe do a post-processing to the answer afterwards.

Now, our curious mind would ask: Can we obtain a Las Vegas algorithm from a Monte Carlo algorithm? If yes, how? If not, why is it not allowed by nature? Are the collection of problems that can be solved using Las Vegas algorithms and the collection of problems that can be solved using Monte Carlo algorithms the same? If we talk about polytime algorithm, a mathematical-minded reader may want to read more about complexity class {\bf ZPP}, {\bf BPP}, and possibly other probabilistic complexity classes.

Advertisements