FPT and kernelization

We prove the following theorem.

Theorem. A decidable parameterized problem Q admits a kernelization if and only if Q \in \text{FPT}.


(\Rightarrow) Suppose Q admits a kernelization. Then there is a polynomial-time algorithm \mathcal{A} such that for every instance (I,k) of Q, the instance \mathcal{A}(I,k) is equivalent to (I,k) and |\mathcal{A}(I,k)| \le g(k) for some computable function g. Suppose also that \mathcal{A} runs in n^c time, where n=|I|+k. Since the kernel \mathcal{A}(I,k) has size bounded by g(k) and Q is decidable, deciding the kernel (by any exhaustive method) takes time bounded by some computable function of k, say h(k). Hence, the total running time for deciding whether (I,k) is a yes- or no-instance is h(k)+n^c \le h(k) \cdotp n^c (assuming h(k) \ge 1 for k \ge 1). Hence, Q is in FPT.

(\Leftarrow) Suppose that Q is in FPT. Then there exists an algorithm \mathcal{A} that decides every instance (I,k) of Q in f(k) \cdotp n^c time for some computable function f and constant c. We want a kernelization algorithm. The kernelization algorithm simulates \mathcal{A} on input (I,k) for n^{c+1} steps. If the simulation halts with a yes or no answer, output a trivial constant-size yes-instance or no-instance of Q accordingly. If it has not halted yet, return (I,k) itself; in this case f(k) \cdotp n^c \ge n^{c+1}, i.e., n \le f(k), so the input size is already bounded by a computable function of k. Note that the kernelization algorithm runs in polynomial time. Hence, Q admits a kernelization. \square
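The (\Leftarrow) direction is itself an algorithm, and the step-budget trick can be sketched in code. Below is a minimal Python sketch: the generator interface, the toy "FPT algorithm", and the trivial yes-/no-instances are all hypothetical stand-ins for a real machine model.

```python
def kernelize(I, k, fpt_algorithm, c, yes_instance, no_instance):
    """Sketch of the FPT => kernelization direction.

    The FPT algorithm is modeled as a generator that yields once per
    computation step and returns its yes/no verdict when it halts.
    We simulate it for n^(c+1) steps (a polynomial budget)."""
    n = len(I) + k
    budget = n ** (c + 1)
    gen = fpt_algorithm(I, k)
    try:
        for _ in range(budget):
            next(gen)
    except StopIteration as halt:
        # Simulation finished: output an equivalent constant-size instance.
        return yes_instance if halt.value else no_instance
    # Budget exhausted: f(k) * n^c >= n^(c+1), hence n <= f(k),
    # so (I, k) is already bounded by a function of k alone.
    return (I, k)

def toy_fpt(I, k):
    """Toy stand-in for an FPT algorithm: it spends 2^k steps and then
    answers whether the string I contains at most k letters 'x'."""
    for _ in range(2 ** k):
        yield
    return I.count("x") <= k

print(kernelize("xxab", 2, toy_fpt, c=1,
                yes_instance=("", 0), no_instance=("x", 0)))  # -> ('', 0)
```

The generator model is only a convenient way to meter "steps" in Python; the point is that the budget n^{c+1} is polynomial, so the whole procedure is a valid kernelization.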


Non-deterministic version of accepting and deciding

{\bf NP} is the class of languages that can be decided in polynomial time by a non-deterministic Turing Machine. Now let {\bf NP}' be the class of languages that can be recognized in polynomial time by a non-deterministic Turing machine.

We can ask the same question: Is {\bf NP}={\bf NP}'?

Theorem. {\bf NP}={\bf NP}'.

The proof is almost the same, except that we use a non-deterministic Turing machine instead of a deterministic one. For completeness, I will write out the proof.

Proof. Clearly, {\bf NP} \subseteq {\bf NP}'. Now, let L \in {\bf NP}'. Then there exists a non-deterministic Turing machine M that accepts every x \in L within p(n) steps, where n=|x| and p is some polynomial. We construct a non-deterministic Turing machine M' that decides L in polynomial time: on input x, M' simulates M(x) for p(n) steps. If the simulation halts and accepts, then M' also accepts. Otherwise, M' rejects. \square

Still, the proof is non-constructive.

Non-constructive proof in complexity

Let {\bf P} be the class of languages that can be decided in polynomial time by a deterministic Turing machine. Let {\bf P}' be the class of languages that can be accepted (recognized) in polynomial time by a deterministic Turing machine.

Theorem. {\bf P}={\bf P}'.

This theorem appears in the CLRS book. The proof is non-constructive, roughly as follows.

Proof. It is true that {\bf P} \subseteq {\bf P}' because if a language can be decided, then it can be recognized. To prove {\bf P}' \subseteq {\bf P}, let L \in {\bf P}'. Because L can be recognized in polynomial time, there is a polynomial p(n) such that L is recognized by a Turing machine M within p(n) steps on any input x of length |x|=n. We now construct a polynomial-time deterministic Turing machine M' that decides L: on input x, M' simulates M(x) for up to p(n) steps. If the simulation halts and accepts, then M' also accepts. Otherwise, M' simply rejects. \square

The main idea of the proof is that if an input x is in L, then the Turing machine M must have accepted by the p(n)-th step. If it has not, then we are sure that x is not an element of L.
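The clipping idea can be illustrated with a small Python sketch. Here a "recognizer" is modeled as a generator that yields once per step and returns True when it accepts; it may loop forever instead of rejecting. The specific recognizer and step counts are made up for illustration.

```python
def decide_with_budget(recognizer, x, budget):
    """Turn a recognizer into a decider by clipping it at `budget` steps.

    The recognizer is modeled as a generator yielding once per step and
    returning True when it accepts (it may loop forever otherwise)."""
    gen = recognizer(x)
    try:
        for _ in range(budget):
            next(gen)
    except StopIteration as halt:
        return bool(halt.value)  # halted within the budget: keep its verdict
    return False  # no acceptance within the budget: x cannot be in L

def even_length_recognizer(x):
    """Recognizes strings of even length in |x| steps; on odd-length
    strings it loops forever instead of rejecting."""
    for _ in x:
        yield  # one "step" per character
    if len(x) % 2 == 0:
        return True
    while True:
        yield

# With the budget p(n) = n + 1, the decider gives a definite answer:
print(decide_with_budget(even_length_recognizer, "abcd", 5))  # -> True
print(decide_with_budget(even_length_recognizer, "abc", 4))   # -> False
```

The sketch makes the non-constructive step visible: `decide_with_budget` only works because we handed it a budget that we happen to know bounds the accepting computations.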

The non-constructive part of the proof is its use of the polynomial bound p(n). The only thing we are given is the fact that L \in {\bf P}', and the question is: how can we find a polynomial bound for L? Is there a constructive method to find such a polynomial p?

From Las Vegas to Monte Carlo with Markov’s Inequality

One of the basic results in probability theory tells us that a non-negative random variable is unlikely to be much larger than its expected value.
Theorem 1. (Markov’s Inequality)
For any non-negative real-valued random variable X and any \alpha > 0, we have
\text{Pr}[X \ge \alpha] \le \frac{\mathbb{E}[X]}{\alpha}.

Proof (for X taking discrete values). Note that
\mathbb{E}[X] = \sum_{x \ge 0} x \cdotp p_X(x)
= \underbrace{\sum_{x<\alpha} x \cdotp p_X(x)}_{\ge 0} + \sum_{x \ge \alpha} x \cdotp p_X(x)
\ge \sum_{x \ge \alpha} \alpha \cdotp p_X(x)
= \alpha \cdotp \text{Pr}[X \ge \alpha],
as desired. \Box
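As a quick sanity check of the inequality, here is a small Python experiment with a fair six-sided die, so that \mathbb{E}[X]=3.5; the sample size and seed are arbitrary choices.

```python
import random

rng = random.Random(1)
samples = [rng.randint(1, 6) for _ in range(100_000)]  # fair die, E[X] = 3.5

alpha = 5
empirical = sum(s >= alpha for s in samples) / len(samples)  # true value: 2/6
markov_bound = 3.5 / alpha                                   # = 0.7

# Markov's bound holds, although it can be far from tight.
print(empirical, markov_bound)
```

Note how loose the bound is here (0.7 versus roughly 1/3): Markov's inequality uses nothing but the expectation, which is exactly why it applies so generally.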

There is an interesting application of this inequality in theoretical computer science: the probability bound allows us to control a probabilistic algorithm derived from another probabilistic algorithm.

There are at least two interesting classes of probabilistic algorithms, distinguished by which aspect of them is random:

  • a class of algorithms whose output is always correct, but whose running time is random (and can take a very long time),
  • a class of algorithms whose running time is deterministic (for example, a deterministic function of the length of the input), but whose answer is sometimes wrong.

The first class is usually called Las Vegas algorithms and the second class is usually called Monte Carlo algorithms. How are these two classes related?

It turns out that if we can solve a problem using a Las Vegas algorithm, we can also solve it using a Monte Carlo algorithm with a reasonably small error probability. Let me argue why this is a useful fact. For many practical purposes, we want to have a decision after a fixed amount of time; it is okay to make mistakes sometimes (as we may be able to handle them later).

To obtain a Monte Carlo algorithm, we can just simulate the Las Vegas algorithm with small modifications. Our modifications aim at fulfilling two requirements: the algorithm runs in a “fixed” time, and it outputs a correct answer with reasonably good probability. To fulfill the first requirement, we fix a threshold on the number of computational steps the Las Vegas algorithm may take, say t(x) steps on an input x. If the Las Vegas algorithm outputs an answer within t(x) steps, our Monte Carlo algorithm outputs the same answer; by the nature of a Las Vegas algorithm, this answer is always correct. If not, we just output a made-up answer (depending on our need).

But how should we choose t(x)? We have freedom here, but we may want to use this choice to fulfill the second requirement: getting a correct answer with good probability. Note that we can analyze our Las Vegas algorithm to obtain its expected running time on input x, say p(x). Our Monte Carlo algorithm outputs a correct answer at least whenever the simulation finishes within t(x) steps. If X is the running time of the simulated Las Vegas algorithm, then the probability of getting a correct answer is at least
\text{Pr}[X < t(x)] = 1- \text{Pr} [ X \ge t(x) ]
\ge 1 - \frac{ \mathbb{E}[X] }{t(x)}
= 1- \frac{p(x)}{3p(x)}
= \frac23,
which is good if we pick t(x)=3p(x). Obviously, we can also pick a larger t(x) to gain a better probability. In fact, as long as we pick t(x)>2p(x), the success probability exceeds 1/2, so we can also amplify it by running multiple independent simulations and combining the answers (for a Boolean answer, for example, by majority voting).
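To make the transformation concrete, here is a small Python simulation. The Las Vegas algorithm below (probing a random index until it hits a marked one, with expected running time n) is a made-up example; the budget 3n matches the choice t(x)=3p(x) in the derivation above.

```python
import random

def las_vegas_find(marked, n, rng):
    """Las Vegas: probe uniformly at random until the marked index is hit.
    The answer is always correct, but the number of probes is random
    (geometric with mean n)."""
    steps = 0
    while True:
        steps += 1
        if rng.randrange(n) == marked:
            return marked, steps

def monte_carlo_find(marked, n, rng, budget):
    """Monte Carlo: the same probing, clipped at `budget` steps.
    Fixed running time, but it may give up and return a made-up answer."""
    for _ in range(budget):
        if rng.randrange(n) == marked:
            return marked
    return None  # made-up answer when the budget runs out

rng = random.Random(0)
n = 100                  # expected Las Vegas running time: n probes
budget = 3 * n           # t(x) = 3 * p(x), as in the derivation above

answer, steps = las_vegas_find(7, n, rng)  # always correct; steps is random

trials = 2000
correct = sum(monte_carlo_find(7, n, rng, budget) == 7 for _ in range(trials))
print(correct / trials)  # Markov guarantees >= 2/3; here it is much higher
```

For this particular toy algorithm, the true success probability is about 1-e^{-3}, far above the 2/3 that Markov's inequality promises, which illustrates again how conservative the bound is.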

Note that we have described an abstraction of how to transform any Las Vegas algorithm into a Monte Carlo algorithm. This can be very useful if, instead of an always-correct answer, we want to know when we will get an answer, perhaps followed by some post-processing of that answer.

Now, our curious mind would ask: Can we obtain a Las Vegas algorithm from a Monte Carlo algorithm? If yes, how? If not, why is it not allowed by nature? Is the collection of problems solvable by Las Vegas algorithms the same as the collection of problems solvable by Monte Carlo algorithms? In the setting of polynomial-time algorithms, a mathematically minded reader may want to read more about the complexity classes {\bf ZPP}, {\bf BPP}, and other probabilistic complexity classes.

Mistakenly defining a set

A few days ago I (finally, for once) worked on a math problem by defining a class/collection that I thought was a set, but which turned out not to be. (This somehow feels like an achievement of its own for someone studying logic, haha.) At first I was confused when the teaching assistant wrote a comment that what I defined was not a set, and then I started thinking about why it is not a set. After discussing it with my friends, Anna and Marvin, over lunch, I became truly convinced that what I defined was not a set.

One of the easiest intuitions for why something is not a set is that it is a collection that is “too big” to be a set. For example, the collection of all sets is not a set. The collection of all cardinals, or of all ordinals, is not a set either.

Now, the problem I was working on was a Model Theory homework problem. I proved that if there is a model B of a theory T such that the theory of B is inconsistent with A, then there is a formula \varphi_B in the theory of B that is also inconsistent with A. What I then did was collect such a formula \varphi_B from every model B of T, in other words,

S=\{\varphi_B: B \vDash T\}.

From this construction, S is not a set if T has a model of infinite cardinality. Why? Because by the Löwenheim–Skolem theorem, in its upward and downward versions, if T has an infinite model, then for every infinite cardinal \kappa, T has a model of cardinality \kappa. This means the formulas in S come from models of almost every cardinality. This indicates that this construction of S is not a set.

So what if S is not a set? Well, the point is that we are not necessarily allowed to apply operations that are valid for sets to S; for example, we cannot simply take the subset of S consisting of elements with a certain property. It is black magic!