I recently wrote that I was rekindling my interest in the Collatz problem. I still haven't had the time to read any literature, but I came up with an interesting transformation of the problem that seems to unearth some structure which is maybe not directly obvious. Regular readers of this blog might notice the similarity between this transformation and the formula I posted earlier for calculating the points of interval halving in closed form. It really was the Collatz problem that inspired that formula. For the uninitiated, the Collatz problem is based on the sequence

$$x_{n+1} = \begin{cases} x_n / 2 & \text{if } x_n \text{ is even,} \\ 3 x_n + 1 & \text{if } x_n \text{ is odd.} \end{cases}$$

The question is whether, for any positive integer starting value, the sequence will always reach the value 1.
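To make the rule concrete, here is a minimal Python sketch of the iteration, assuming the standard Collatz rule (halve even numbers, map odd *x* to 3*x* + 1). The function names and the `max_steps` safety cap are illustrative choices, not part of the problem statement.

```python
def collatz_step(x: int) -> int:
    """One step of the Collatz map: halve even numbers, send odd x to 3x + 1."""
    return x // 2 if x % 2 == 0 else 3 * x + 1


def collatz_trajectory(start: int, max_steps: int = 10_000) -> list[int]:
    """Iterate from `start` until 1 is reached or `max_steps` is exhausted."""
    if start < 1:
        raise ValueError("start must be a positive integer")
    trajectory = [start]
    # The cap only guards against running forever; it is not part of the problem.
    while trajectory[-1] != 1 and len(trajectory) <= max_steps:
        trajectory.append(collatz_step(trajectory[-1]))
    return trajectory


print(collatz_trajectory(7))
```

Starting from 7, for instance, the trajectory climbs as high as 52 before it falls to 1.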

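One can reverse the problem as follows. Define the set *C* as the smallest set of positive integers such that

$$1 \in C, \qquad x \in C \Rightarrow 2x \in C, \qquad x \in C \ \text{and}\ \tfrac{x - 1}{3} \in \mathbb{N}^{+} \Rightarrow \tfrac{x - 1}{3} \in C.$$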

The problem then transforms to the question of whether *C* contains all positive integers, in other words: is $C = \mathbb{N}^{+}$? In this form the problem can be represented as a tree with 1 at its root, in which every element *x* has at most two children. One child is always 2*x*; the other child is (*x* − 1) / 3 if that number is an integer.
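The reversed rule can be sketched in Python as follows. This is only an illustration: the names `tree_children` and `reachable_within` are made up here, and the requirement that (*x* − 1) / 3 be positive is an added assumption so that the root 1 does not produce 0. Because 4 has the root 1 as a child, the sketch skips numbers it has already seen.

```python
def tree_children(x: int) -> list[int]:
    """Children of x in the reversed tree: always 2x, plus (x - 1) / 3 if integral."""
    children = [2 * x]
    # Assumption: only positive integers count, so x = 1 does not yield 0.
    if x > 1 and (x - 1) % 3 == 0:
        children.append((x - 1) // 3)
    return children


def reachable_within(depth: int) -> set[int]:
    """All numbers reachable from the root 1 in at most `depth` steps."""
    seen = {1}
    frontier = {1}
    for _ in range(depth):
        # Drop numbers already seen (4 leads back to the root 1).
        frontier = {c for x in frontier for c in tree_children(x)} - seen
        seen |= frontier
    return seen


print(sorted(reachable_within(5)))
```

Five steps from the root are enough to reach 5, by way of 2, 4, 8 and 16.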

Every odd number *x* trivially has an infinite number of descendants of the form 2^{k}*x*. I will call the set of these numbers the tower of *x*.

**Definition:** The tower *T*_{x} of an odd number *x* is the set $T_x = \{\, 2^k x : k \in \mathbb{N}_0 \,\}$. *x* is called the base of the tower.

We also define the children of *T* in the natural way, as a set of towers which are based on those children of the elements of *T* which are not elements of *T* themselves.

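**Definition:** The children *C*_{T} of a tower *T* are given by the set

$$C_T = \left\{\, T_y : y = \frac{x - 1}{3} \text{ is an odd positive integer for some } x \in T,\ y \notin T \,\right\}.$$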

It's easy to show that every tower either has no children, if the base is a multiple of 3, or an infinite number of children otherwise.
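The case distinction comes down to arithmetic modulo 3. As a short sketch, write $b$ for the odd base of the tower (a symbol introduced here for illustration), so that the elements of the tower are $2^k b$. Since $2 \equiv -1 \pmod 3$,

$$2^k b \equiv (-1)^k\, b \pmod 3 .$$

If $3 \mid b$, then $2^k b - 1 \equiv -1 \pmod 3$ for every $k$, so $(2^k b - 1)/3$ is never an integer and the tower has no children. Otherwise $b \equiv \pm 1 \pmod 3$, and $2^k b \equiv 1 \pmod 3$ holds for every even $k$ (if $b \equiv 1$) or for every odd $k$ (if $b \equiv 2$). Each such $k$ gives a different integer $(2^k b - 1)/3$, hence infinitely many children.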

I would like to find a way to map each tower onto a single number, i.e. every element of *T*_{x} should be mapped onto the same value. To do this I take the largest exponent *k*_{x} for which 2^{k_x} is smaller than or equal to *x*: $k_x = \lfloor \log_2 x \rfloor$. We now define

$$t_x = \frac{x}{2^{k_x}} - 1.$$

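The values of *t*_{x} lie between 0 (inclusive) and 1 (exclusive) and are rational numbers of the form

$$t_x = \frac{p}{2^{k_x}}, \qquad p = x - 2^{k_x}, \quad 0 \le p < 2^{k_x}.$$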

It is clear that *k*_{2x} = *k*_{x} + 1 and therefore *t*_{2x} = *t*_{x}. This means that all the elements of a tower are mapped onto the same value, and different towers are mapped onto different values. We can, therefore, identify these values with the towers they originate from, and we will refer to the towers *T* by their values *t*_{T}.
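A minimal Python sketch of this mapping, assuming the tower value is *t*_{x} = *x* / 2^{k_x} − 1 (the form consistent with *t*_{7} = 3 / 4). The function name `tower_value` is made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def tower_value(x: int) -> Fraction:
    """Map x to t_x = x / 2**k - 1, where 2**k is the largest power of 2 <= x."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1  # largest exponent k with 2**k <= x
    return Fraction(x, 2**k) - 1


# Every element of the tower of 7 (7, 14, 28, ...) gives the same value.
print([tower_value(7 * 2**j) for j in range(4)])
```

Doubling *x* shifts its binary representation by one place, which raises *k* by one and leaves the quotient unchanged; that is why `bit_length` is a convenient way to find *k*.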

The next question that arises is: given a tower *t*_{T}, what do its children look like? As an example, the children of *t*_{7} = 3 / 4 are plotted in the figure on the right. The values of the first few of these children are given in the following table.

| Fraction | Decimal |
|---|---|
| 1 / 8 | 0.125 |
| 5 / 32 | 0.15625 |
| 21 / 128 | 0.16406 |
| 85 / 512 | 0.16602 |
| 341 / 2048 | 0.16650 |
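The table can be reproduced with a few lines of Python. This is a sketch under two assumptions: that the tower value is *t*_{x} = *x* / 2^{k_x} − 1, and that the children of the tower with base *b* are the towers based on the odd numbers (2^{k}*b* − 1) / 3 for *k* ≥ 1 that differ from *b* itself. The function names are made up for illustration.

```python
from fractions import Fraction


def tower_value(x: int) -> Fraction:
    """t_x = x / 2**k - 1, where 2**k is the largest power of 2 <= x."""
    k = x.bit_length() - 1
    return Fraction(x, 2**k) - 1


def child_values(base: int, count: int) -> list[Fraction]:
    """Tower values of the first `count` children of the tower with odd base `base`."""
    if base % 2 == 0:
        raise ValueError("a tower base must be odd")
    if base % 3 == 0:
        return []  # 2**k * base - 1 is never divisible by 3
    values = []
    k = 1  # starting at k = 1 keeps (x - 1) / 3 odd, so it is a valid tower base
    while len(values) < count:
        x = base << k  # x = 2**k * base, an element of the tower
        if x % 3 == 1 and (x - 1) // 3 != base:  # skip the tower itself
            values.append(tower_value((x - 1) // 3))
        k += 1
    return values


for value in child_values(7, 5):
    print(f"{value.numerator} / {value.denominator} | {float(value):.5f}")
```

The children of 7 found this way are based on 9, 37, 149, 597 and 2389, which give exactly the five fractions in the table.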

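It can be seen, and easily proved, that if *p* / 2^{k} is a child of any *t*, then so is (4*p* + 1) / 2^{k + 2}. This results in a geometric series that converges to

$$\frac{p}{2^{k}} + \frac{1}{3 \cdot 2^{k}} = \frac{3p + 1}{3 \cdot 2^{k}},$$

which for the children of *t*_{7} is 1 / 6.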
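Both claims can be checked with a short derivation. This is a sketch that assumes the tower value is $t_x = x / 2^{k_x} - 1$; the symbols $b$, $m$, $y$, $y'$ and $p_n$ are introduced here for illustration. Let $y = (2^m b - 1)/3$ be the base of a child of the tower with base $b$, and write $y = 2^k + p$ with $0 \le p < 2^k$, so that $t_y = p / 2^k$. Two doublings further up the tower,

$$y' = \frac{2^{m+2} b - 1}{3} = 4y + 1 = 2^{k+2} + (4p + 1), \qquad 4p + 1 < 2^{k+2},$$

so $y'$ is again odd and $t_{y'} = (4p + 1) / 2^{k+2}$. Iterating $p_0 = p$, $p_{n+1} = 4 p_n + 1$ gives $p_n = 4^n p + (4^n - 1)/3$, and therefore

$$\frac{p_n}{2^{k + 2n}} = \frac{p}{2^k} + \frac{1}{3 \cdot 2^k}\left(1 - 4^{-n}\right) \;\longrightarrow\; \frac{3p + 1}{3 \cdot 2^k} \quad (n \to \infty).$$

Starting from the child 1 / 8 of *t*_{7}, that is $p = 1$ and $k = 3$, the limit is $4/24 = 1/6$.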

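The figure above plots the children of the first 1000 towers against the values of the towers themselves. Interestingly, a clear structure emerges. The sequences of the children all seem to converge towards two straight line segments. These line segments can be represented by the function

$$f(t) = \begin{cases} \dfrac{4t + 1}{3} & \text{for } 0 \le t < \tfrac{1}{2}, \\[6pt] \dfrac{2t - 1}{3} & \text{for } \tfrac{1}{2} \le t < 1. \end{cases}$$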
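The convergence can be checked numerically. The Python sketch below rests on several assumptions: the tower value is taken to be *t*_{x} = *x* / 2^{k_x} − 1, the children of a tower with base *b* are taken to be the odd numbers (2^{k}*b* − 1) / 3 other than *b*, and the two segments are taken to be (4*t* + 1) / 3 for *t* < 1/2 and (2*t* − 1) / 3 for *t* ≥ 1/2. All function names are made up for illustration.

```python
from fractions import Fraction


def tower_value(x: int) -> Fraction:
    """t_x = x / 2**k - 1, where 2**k is the largest power of 2 <= x."""
    k = x.bit_length() - 1
    return Fraction(x, 2**k) - 1


def limit_line(t: Fraction) -> Fraction:
    """Assumed limit of the children of the tower with value t (two segments)."""
    return (4 * t + 1) / 3 if t < Fraction(1, 2) else (2 * t - 1) / 3


def child_values(base: int, count: int) -> list[Fraction]:
    """Tower values of the first `count` children of an odd base not divisible by 3."""
    if base % 2 == 0 or base % 3 == 0:
        raise ValueError("base must be odd and not a multiple of 3")
    values = []
    k = 1
    while len(values) < count:
        x = base << k  # x = 2**k * base
        if x % 3 == 1 and (x - 1) // 3 != base:
            values.append(tower_value((x - 1) // 3))
        k += 1
    return values


def gap_to_line(base: int, count: int = 12) -> Fraction:
    """Distance between the `count`-th child and the assumed limit."""
    return abs(limit_line(tower_value(base)) - child_values(base, count)[-1])


bases = [b for b in range(1, 200, 2) if b % 3 != 0]
print(float(max(gap_to_line(b) for b in bases)))
```

Because the fractions are exact, a small gap here is not a rounding artefact, although a numerical check over a finite range is of course no proof.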

When I first plotted this graph and discovered this very simple structure hidden in the Collatz problem, I was utterly amazed. Of course, I don't know enough to decide whether this structure could hold the key to unravelling the problem, but it certainly gives me hope.

Finally, an apology to all those who know more about the problem. These things probably have been discovered already. I have not supplied any references and most likely got the terminology wrong. If so, I will be happy to be corrected.

This is very interesting.

Is it possible to get access to your data?