Categories
Uncategorized

Escaping Gödel’s Incompleteness Theorem: Finding expressive yet decidable logic

Propositional logic is perfect for people who like to automate things like computer scientists. If you have a theory that can be described using only propositional logic, you can prove every theorem (true statement) in your theory using an algorithm in finite time. Moreover, you can also debunk any non-theorem in finite time, too. In logic jargon, this means that propositional logic has a proof system or mechanism that is sound (will never produce junks) and complete (produce all possible truths). In fact, SAT-solver is all you need. SAT-solver has entered industry for long time of course and a prestigious competition is running every year to make the algorithm faster.

Mathematicians like to argue using wild and abstract imaginations. They invent and talk about numbers, infinity, and what not. The proof of a theorem by a mathematician, including contemporary ones, can read controversial to other expert mathematicians due to its abstract nature; not only the objects in question, but also the way each step in the argument is carried.

Logic in mathematics usually comes later only to verify the arguments used. Unfortunately, many mathematicians will be frustrated if only propositional logic is allowed. They will be very limited when talking about properties of natural numbers, for example, which first order logic needs to come in play; probably equipped with mathematical induction to support their arguments. Sometimes, one even needs to lift the background logic to the second-order logic or even higher-order logic when the arguments are too complicated. For those who are not familiar, these are some kinds of logic which allow more expressive statements and sophisticated arguments to produce new truths.

The problem is, once a theory is already expressive enough to allow us to talk about natural number systems 0,1,2,3,\dots (e.g. Robinson Arithmetic is the most minimal such theory I can think of), there must be a true statement, true according to that theory, that we can never prove under some sound mechanism. This is a phenomenal result in mathematical logic by Kurt Gödel called Gödel’s Incompleteness Theorem in 1931. Many philosophical discussions arise about the limitation of mathematics in our search for truths; some exaggerates this result, too, sometimes based on a misinterpretation of the results.

Anyway, the remaining of this post will not be about the interpretation of Kurt Gödel’s work and its philosophical implication. It is interesting, but in this post, the goal would be to convey that we can still escape from this limitation by designing a new logic that is expressive enough to talk about complex stuff, yet still has sound and complete proof mechanism. Modal logic is probably favorites of many and I will probably post about it some day if time permits, but here I want to describe a mechanism to decide whether a statement is true or not in first order logic limited with unary predicate only.

First of all, propositional logic allows us to form a sentence composed of propositions and connect them with and, or, ifthen, and not. By proposition, I mean a sentence which simply describes a property of one or more objects, for example “Donald Trump is evil” and can be either true or false; a compound sentence then will be like “Donald Trump is evil and my mom hates him.

In first order sentence, we can introduce variables in our sentence such as “Someone is evil” or “Everyone is evil”. This sentence uses variables since we do not mention who exactly is evil. Using mathematical notations, we can write “Someone is evil” as (\exists x) E(x) where E(x) is read “x is evil” and (\exists x) is read “there exists an x“; the full sentence is read “There is an x such that x is evil”. “Everyone is evil” can be written as (\forall x) E(x) where (\forall x) is read “for every x“. The x is called variable and E(x) is a predicate (or, a property) that we want to attach to some objects.

A predicate can involve more than one objects. One can define a binary predicate H(x,y) as “x hates y” which is a predicate involving two objects, or a ternary predicate such as T(x,y,z) defined as “x is a mutual friend of y and z“. It can also involve more that two objects. A sentence such as “Someone is evil and hated by everybody” can be written as first order sentence (\exists x)(\forall y)(H(y,x) \wedge E(x)) using our logic symbols that we have described above.

This post tries to convince you that the fragment of of first order logic that only allows unary predicates has a sound and complete proof mechanism. More concretely, predicates that involve two or more objects such as H(x,y), T(x,y,z) and so on are not allowed in a sentence in our theory.

An example \mathbb{T} of a sentence in a first-order sentence which consists of only unary predicate is

((\forall x)P(x) \vee (\forall y)Q(y)) \wedge (\exists z)(\neg P(z) \wedge \neg Q(z))

which can be read “Everything is P or everything is Q, but there is something that is neither P nor Q“, for any meaning we attach to P and Q. In first order logic, this can never be true, for whatever the meaning of P and Q we use and any objects in place of x,y,z. This kind of sentence is called contradiction in logic. If you are not convinced yet, replace P and Q with “rich” and “beautiful”, respectively, and reflect about what the reading above tries to state when “everything” is changed to “everyone”.

To decide any first-order sentence with only unary predicate, the key idea is based on the following fact proved by Löwenheim-Skolem in 1915, which I rephrase frivorously:

If a first-order sentence \mathbb{S} can expressed using k unary predicates and \ell variables, and if ever \mathbb{S} is a true statement, there will be a world consisting of only 2^k \cdot \ell objects such that \mathbb{S} is also true in this (finite) world.

Löwenheim-Skolem, 1915

(Readers might have noticed and be disappointed that this theorem is proved way before Gödel proved his extraordinary results. But I would like to mention that the endeavor to find a decidable logic for more and more fragments of mathematics are still alive until today.)

The fact that $\mathbb{T}$ will be true in a finite world with only 2^k \cdot \ell objects, when it can be true, is so important that a computer can just focus on these finite worlds and brute-force on all possible interpretations of the predicates on each of the 2^k \cdot \ell objects (whether each object satisfies a certain predicate) and terminate eventually once it finishes traversing all the possibilities of these interpretations, which are also finite.

Our sentence \mathbb{T} has k=2 predicates, viz. P and Q, and \ell=3 variables, viz. x,y,z, so if \mathbb{T} can be true, it must be true too in some world that has 2^2 \cdot 3 = 12 objects in it. More concretely, imagine a world that only has 12 people in it where each person is named A, B, C, D, E, F, G, H, I, J, K, and L. Next, some of these people might have property P and Q. Again, without loss of generality, take for example, predicate P is “rich” and Q is “beautiful”. Under this interpretation, some of these 12 people are rich and some are beautiful. For example, a possible scenario would be A, B, C, F, and G being rich while the rest is poor and B , C, H, and L, are beautiful and the rest is ugly. Of course there are many more possible scenarios in this world about who’s rich and beautiful.

The Löwenheim-Skolem theorem states that if \mathbb{T} is ever true (in any world and any interpretation of P and Q), there must be a scenario in this world of 12 people that we have described above where \mathbb{T} is true. Thus, a computer can just search over all possible (2^{12})^2=16777216 scenarios to find out whether \mathbb{T} is true in at least one scenario: that there is a scenario where “every one of A, B, C, D, E, F, G, H, I, J, K, L is rich, or every one of A, B, C, D, E, F, G, H, I, J, K, L is beautiful, but at least one of them is neither rich nor beautiful”. If none of these scenarios in this world of only 12 people make \mathbb{T} true, then \mathbb{T} can never be true for any meaning we attach to P and Q and any objects with assign to x,y,z from any possible worlds, i.e. T is guaranteed to be a contradiction; it will be never true whatsoever. Indeed, it is also easier for us to check our intuition about whether \mathbb{T} can be true in this world of 12 people. We can imagine that if all these 12 people are rich, then none of them can be not-rich, since everybody is rich; contradicting the last premise about someone being not rich. Similarly, if the 12 people are beautiful, then the last premise, about someone being not beautiful, cannot also be satisfied since everybody is beautiful. The main weapon is this reduction to the world of only 12 people; we can just focus the checking on this world of 12 people. The finiteness entails that the number of states/scenarios that the computer needs to check is finite too and hence it just needs to check one by one and will eventually finish them all. If this reduction does not exist, the computer might need to run indefinitely and no guarantee that the procedure will ever give us an answer. In our example, the computer will finally check all 16777216 scenarios and find out that none of them makes \mathbb{T} true, hence we can be sure that \mathbb{T} can never be true no matter what the context is.

Now, one may ask: Is there a first-order statement that is ever true and how can we check it? The answer is yes and the following example will demonstrate it. Consider the following sentence \mathbb{T}':

(\exists x)P(x) \wedge (\exists y)Q(y) \wedge \neg(\exists z)( P(x) \wedge Q(x)).

If we agree to interpret P as “rich” and Q as “beautiful”, then the statement \mathbb{T}' will read “Somebody is rich and somebody is beautiful but nobody is both rich and beautiful.” The sentence has k=2 predicates and \ell=3 variables, so our decision algorithm would consider a world consisting 2^2 \cdot 3=12 objects (or people) again, let’s say A, B, C, D, E, F, G, H, I, J, K, L. Our computer can try to check all the possible 16777216 scenarios who’s rich and who’s beautiful and see if at least one of makes \mathbb{T}' true. The computer will eventually find, for example, that a scenario where the A, B, C, D, E, F are beautiful but not-rich, while G, H, I, J, K, L are rich but not-beautiful, is an instance of the world of 12 people where \mathbb{T}' is true! Thus the computer will say: “Yes, \mathbb{T}' can be true.” This is not the only possible scenario that makes \mathbb{T}' true though, but the Löwenheim-Skolem theorem guarantees that it will find at least one scenario in this world of 12 people; no need to check the other infinite possibilities.

The proof of the theorem will not be contained here but I hope this post has convinced you how limiting the expressiveness of logic can give us power to effectively separate what is right and wrong in a logical system. I would not say that we should drop the whole number system for example; to escape Gödel’s theorem. I think numbers are useful and beautiful and many mathematical objects are. But for many real-life problems, I believe, the general discussion in mathematics will automatically reduce to finite domains hence allowing us to brute-force using computer; if we are not constrained with efficiency issue.

One possible simple take on Gödel’s Incompleteness Theorem is thus as follows. Gödel’s Incompleteness Theorem shall not stop people doing mathematics to invent more truths, even though not all, in the mathematical universes, and possibly develop us with new languages to speak about the physical nature and other complex phenomena. Probably some of the truths that are not provable in the platonic world of mathematics do not really matter in real world when there is an urgency to deal with the relevant mathematics. In addition to it, hopefully, in real world, their theory will be reducible to something that is more tangible, be it immediately or after some amount of work on simplification and/or modification. This might be true since there is a big chance that in real world, the mathematics is forced to fit into discrete and finite domains. There is probably a chance that in such case, we might not need to use the full theory to solve the real-world problems, but simply just a fragment of it. Indeed, this is what many logicians are trying to do. Many logicians around the world, especially those who are motivated with computational aspect of truths, are absorbed and enthusiastic into finding fragments of logic that allow us to effectively and probably efficiently decide truths, but are expressive enough to talk about complex problems they need to tackle. Check out the development of modal logic and description logic, for example.

Do you agree with this view of Gödel’s Incompleteness Theorem? What do you think are the issues of this approach? Let me know what you think.

Categories
Uncategorized

Note on probability of placing rooks

Revoking a post from 5 years ago, I wrote a discussion about the probability of randomly placing rooks on a chessboard in distinct rows and columns. The precise question in general setting is as follows.

Problem. k rooks are placed on an n \times n chessboard randomly. What is the probability that the rooks are all in different rows and columns?

The previous post is about this problem with k=5 and n=8 and for such instance, the probability is very small; around 0.04935660144. This intrigues me; why so small?

So the number of ways putting k rooks on n \times n chessboard is P^{n^2}_k = n^2 \cdot (n^2-1) \cdot \dots \cdot (n^2-k+1). Now, to put k rooks so that they are all on different rows and columns, there are n^2 options for the first rook, (n-1)^2 options for the second rook, and so on up to (n-k+1)^2 options for the k-th rook. The probability is thus

\displaystyle p(n,k)=\prod_{i=0}^{k-1}  \dfrac{(n-i)^2}{n^2-i}.

We can see that, compared to its denumerator, the numerator decreases much faster each time i is incremented. But, how can we describe this assymptotically in k and n?

Using the following trick:

\dfrac{(n-i)^2}{n^2-i} \le 1-\dfrac{i}{n} \le \text{exp}\left(-\dfrac{i}{n} \right),

for any n \ge i \ge 0, we can prove that

p(n,k) \le \text{exp}\left( - \dfrac{k^2-k}{2n} \right).

Hence, the probability suffers an exponential decay in k^2/2n.

For k=n, we have that p(n,n) \le \text{exp}(-(n-1)/2).

Categories
Uncategorized

Reciprocals of natural numbers

Last time, I posted about the convergence of the reciprocals of palindromes. In this post, I will discuss the motivation of that problem back then.

It is known that the harmonic series \dfrac{1}{1}+\dfrac{1}{2}+\dfrac{1}{3}+\dots is divergent (unbounded; goes to infinity). But, if the sum of the reciprocals of the squares, \displaystyle \dfrac{1}{1^2}+\dfrac{1}{2^2}+\dfrac{1}{3^2}+\dots is convergent (has finite sum) and the limit is \dfrac{\pi^2}{6}. This is called Basel’s problem.

A standard approach of seeing the convergence of such series is through considering the p-th power of the elements of the series, where p is a positive real number. If we let

\displaystyle S_p= \sum_{n \in \mathbb{N}} \dfrac{1}{n^p} = \dfrac{1}{1^p}+\dfrac{1}{2^p}+\dfrac{1}{3^p}+\dots,

it is known that S_p is convergent if and only if p>1. So, a series as “tight to harmonic series” as \displaystyle \dfrac{1}{1^{0.99999}}+\dfrac{1}{2^{0.99999}}+\dfrac{1}{3^{0.99999}}+\dots will go to infinity while small increase on the power such as \displaystyle \dfrac{1}{1^{1.000001}}+\dfrac{1}{2^{1.000001}}+\dfrac{1}{3^{1.000001}}+\dots will never go beyond a finite number, making the sum finite.

A standard approach to analyze this “dichotomy” is by using integral

\displaystyle \sum_{n \in \mathbb{N}} \dfrac{1}{n^p} \le \lim_{t \rightarrow \infty} \int_1^t \dfrac{1}{x^p} \text{d}x

The later is finite when p>1 and infinite for 0 \le p \le 1.

But another “generalization” of figuring out why S_p is divergent for p=1 and convergent for p=2 is by investigating sum of reciprocals of some elements of \mathbb{N}.

Let S \subseteq \mathbb{N}, we want to know if

\displaystyle \sum_{n \in S} \dfrac{1}{n}

is convergent or not. The harmonic series is constructed with S=\mathbb{N} the full set while the Basel’s problem is constructed with S=\{k^2: k \in \mathbb{N}\} which is the set of all perfect squares.

The convergence then seemingly speaks about the “volume” of S. If we take elements in S too much, then the series \displaystyle \sum_{n \in S} \dfrac{1}{n} will be divergent like harmonic series, while if S has “modest or small” volume such as the set of perfect squares, the series ceases to be finite!

There are a lot of results of course on various S:

  • If S=\{2^k: k \in \mathbb{N}\}, the corresponding series is the familiar geometric series \dfrac{1}{2}+\dfrac{1}{2^2}+\dfrac{1}{2^3}+\dots which is convergent to 1. This is probably one of the most famous series and appears in paradox such as Zeno’s paradox.
  • If S=\{p \in \mathbb{N}: p \text{ is prime } \}, the corresponding series \dfrac{1}{2}+\dfrac{1}{3}+\dfrac{1}{5}+\dots is divergent. It is known that there are approximately \dfrac{x}{\ln x} prime numbers that are less than x by the prime number theorem; this is a bit more than the number of perfect squares.
  • If S=\{k^k: k \in \mathbb{N}\}, the corresponding series \dfrac{1}{1^1}+\dfrac{1}{2^2}+\dfrac{1}{3^3}+\dots is exactly the same as its integral “approximation” \displaystyle \int_0^1 \dfrac{1}{x^x} \text{d}x. This incidence is called Sophomore’s Dream. 🙂
  • If S=\{F_n: n \ge 0\} is the set of Fibonacci numbers defined by F_0=F_1=1 and F_{n+1}=F_n+F_{n-1}, the corresponding series is convergent (due to its “exponential nature”).
  • etc.

See more interesting results in its dedicated Wikipedia page: List of sum of reciprocals. Also, see the following paper published in 2016 by Ing. Pier Franz Roggero, Dr. Michele Nardelli, and P.i. Francesco Di Noto that tries to relate this kind of series with physics: Sum of the reciprocals of famous series: mathematical connections with some sectors of theoretical physics and string theory.

I think the following paper in 2013 by Jonathan Bayles and Dominic Klyve will provide a good historical context of reciprocal series in mathematics and computation: Reciprocals as Knowledge Metric: Theory, Computation, and Perfect Numbers.