Monday, 16 May 2011

Collatz: Bringing some order into the chaos

I recently wrote that I was rekindling my interest in the Collatz problem. I still haven't had the time to read any literature but I came up with an interesting transformation to the problem that seems to unearth some structure which is maybe not directly obvious. For regular readers of this blog, you might see the similarities with this transformation and the formula I posted earlier on calculating the points for interval halving using a closed formula. It really was the Collatz problem that inspired that formula. To the uninitiated, the Collatz problem is based on the sequence
x_{n+1} = \begin{cases}
x_n/2 & x_n \text{ even}\\
3x_n + 1 & x_n \text{ odd}
\end{cases}

The question is whether, for any positive integer starting value, the sequence will always reach the value 1.

One can reverse the problem in the following way. Define the set C in the following way


\begin{alignat}{2}
&1 \in C\\
&\text{if } x \in C \quad\text{ then }\quad2x \in C\\
&\text{if } x \in C \quad\text{ and }\quad x \text{ mod } 3 = 1 \quad\text{ then }\quad\frac{x-1}{3} \in C
\end{alignat}

The problem then transforms to the question, whether C contains all positive integers, in other words: is C=\mathbb{N}? In this form the problem can be represented as a tree with 1 at its root, every element x has at most two children. One child is always 2x, the other child is (x − 1) / 3 if that number is an integer.

Every odd number x trivially has an infinite number of descendants of the form 2kx. I will call the set of these numbers the tower of x.

Definition: The tower Tx of an odd number x is the set T_x=\{2^k x | k=0,1,2,\ldots\}. x is called the base of the tower.

We also define the children of T in the natural way, as a set of towers which are based on those children of the elements of T which are not elements of T themselves.

Definition: The children CT of a tower T is the set

C_T = \{T_y | y = (x-1)/3 \quad\text{ where }\quad x \in T\quad\text{ and }\quad x \text{ mod } 3 = 1 \}

It's easy to show that every tower either has no children, if the base is a multiple of 3, or an infinite number of children otherwise.

I would like to find a way to map each tower onto a single number, i.e. every element x of Tx should be mapped onto the same value. To do this I take the largest power kx of 2 that is smaller or equal to x: 2^{k_x} \le x < 2^{k_x+1}. We now define

t_x := 2^{-k_x} x - 1

The values of tx lie between 0 and 1 and are rational numbers of the form

t_x = \frac{p}{2^k}.

It is clear that k2x = kx + 1 and therefore t2x = tx. This means, all the elements of a tower are uniquely mapped onto a single value. We can, therefore, identify these values with the towers they originate from and we will refer to the towers T by their t_T := t_x, x \in T.

The next question that arises is, given an tower tT what do the children of a tower look like. As an example, the children of t7 = 3 / 4 are plotted in the figure on the right. The values of the first few of these children are given in the following table.

Children of t = 3/4
Fraction Decimal
1 / 8 0.125
5 / 32 0.15625
21 / 128 0.16406
85 / 512 0.16602
341 / 2048     0.16650

It can be seen, and easily proved, that if p / 2k is a child of any t, then so is (4p + 1) / 2k + 2. This results in an geometric series that converges to

(3p+1)/(3\times 2^k)

The figure above plots the children of the first 1000 towers against the values of the towers themselves. Interestingly a clear structure emerges. The sequences of the children all seem to converge against two straight line segments. These line segments can be represented by the function

y = \begin{cases}
\frac{1}{3}( 4x + 1) & \quad \text{for } x< \frac{1}{2}\\
\frac{1}{3}( 2x - 1) & \quad \text{for } x> \frac{1}{2}
\end{cases}

When I first plotted this graph and discovered this very simple structure hidden in the Collatz problem, I was utterly amazed. Of course I don't know enough to decide if this structure could hold the key to unravelling the problem but it certainly gives me hope.

Finally, an apology to all those who know more about the problem. These things probably have been discovered already. I have not supplied any references and most likely got the terminology wrong. If so, I will be happy to be corrected.

1 comment:

  1. This is very interesting.

    Is it possible to get access to your data?

    ReplyDelete