Introduction

The Collatz conjecture has the infamy of a black hole. Young, promising mathematicians are sucked into it, never to be seen again, because this conjecture has been unsolved for so long (and is quite difficult, as expected).

The statement of the Collatz conjecture is that nN+\forall n \in } and a function f(n)f(n) operating on it repeatedly, where f(n)={3n+1 if n is oddn2 if n is evenf(n) = 3n + 1 {2} the sequence of n⁠,f(n)⁠,f(f(n))⁠,f(f(f(n)))n, f(n), f(f(n)), f(f(f(n))) \cdots will eventually reach 1.1.

Additionally, the aforementioned sequence is referred to as the Collatz sequence.

As can be seen, it is simple to state, and yet there are still no solutions. A good introductory video to the conjecture can be found here: https://www.youtube.com/watch?v=094y1Z2wpJg by Veritasium.

Intuitively, the Collatz Conjecture makes sense. The of growth for nn in the function is 34<1⁠,{4} < 1, implying that the conjecture does eventually converge at 11 (Veritasium).

One might think that the growth rate is 32⁠,{2}, implying that sequences will eventually diverge, since nn either increases by a factor of 33 or decreases by a factor of 12.{2}. Since both cases happen with roughly equal probability (half of all numbers are odd, the other half are even), we can say that the overall growth factor is likely to be about 312=323 \cdot {2} = {2}

However, for all odd numbers, 3n+13n + 1 yields an even number (as 3 times an odd number is still odd, adding 1 makes it even). This means that we know the next move will be to divide the new value by 2, and then the parity of the number is random. This means that we can instead consider

f(n)={3n+12 if n is oddn2 if n is evenf(n) = {2} {2}

And thus the growth rate is, on average, 3212=34⁠,{2} \cdot {2} = {4}, which does imply an eventual convergence.

Interesting Facts

There are several interesting observations on the Collatz conjecture that are also pretty basic and serve as a nice introduction.

  • The Collatz conjecture can never yield a number that is a multiple of 3 except at the very beginning of the sequence, since all odd terms become 1mod31 \bmod 3 and all even terms cannot gain a factor of 3.3. This means, for example, that when looking at the numbers in Collatz sequences, only 23{3} of the natural numbers could be used in the sequence.

Cycle

The Collatz conjecture includes the following:

The only positive integer cycle (where the repeated application of f(n)f(n) reaches a value previously reached) is 4⁠,2⁠,1.4, 2, 1. As is evident, 4/2=2⁠,2/2=1⁠,4/2 = 2, 2/2 = 1, and 3(1)+1=4.3(1) + 1 = 4.

An important theorem that initially guided cycle length calculation follows. Define kk as the number of 3x+13x + 1 operations, and nn as the number of x2{2} operations. If there is a cycle besides the 4-2-1 cycle, name it’s lowest element as greater than some number R.R. Then,

log2(3)<nklog2(3+1R) (3) < {k} \leq \left( 3 + {R} \right)

(Sinisalo 2003).

We can continue with the proof!

Denote the Collatz sequence of values in the cycle such that each consecutive term is the result of applying the recursive function on the previous value. Denote this sequence c1⁠,c2⁠,\hdots⁠,cm. c_1, c_2, \hdots , c_m. Additionally, note that c1c_1 is the result of the Collatz function applied to cm.c_m.

Now, let’s look at the ratios of the successive terms in the sequence. This would be c2c1c3c1c1cm.{c_1} \cdot {c_1} \cdots {c_m}. Trivially, this telescoping series is equal to 1.

Then, note that every one of the nn operations where the previous value is divided by two has a ratio of 12.{2}. This means that these values, multiplied, give 12n.{2^n}. For the series above to equal 1, the product of the ratios of successive elements in each of the kk elements must also equal 2n.2^{n}.

Then, note that the ratio of successive elements for each of the kk elements is also greater than 33 and less than 3c1+1c1{c_1} (since we are operating 3x+13x + 1) and c1c_1 is the first and smallest value without loss of generality.

This means that the product of each of these ratios is bounded by 3k3^k and (3+1c1)k.\left( 3 + {c_1} \right)^k.

Thus, we can construct the following inequality: 3k<2n(3+1c1)k.3^k < 2^n \leq \left( 3 + {c_1} \right)^k.

Taking log2log_{2} of each section of the expression, and applying the log rule log(bk)=klog(b)⁠, \log (b^k) = k \log (b), we get that klog23<nklog2(3+1R)⁠, k {3} < n \leq k \left( 3 + {R} \right), and dividing all parts by k⁠,k, we get that log2(3)<nklog2(3+1R). (3) < {k} \leq \left( 3 + {R} \right).

This can be used for the following theorem:

Let ab{b} be the rational number with least possible denominator bb such that log2(3)<ablog2(3+1R). (3) < {b} \leq \left( 3 + {R} \right). Then, the least possible cycle length for a new cycle (not 4-2-1) is a+ba + b (Sinisalo 2003). A rigorous proof requires math outside the scope of this article, but can be found here! https://www.probleme-syracuse.fr/cargo/Sinisalo\_collatz.pdf.

However, a rough outline of the proof comes from taknig the floor function on this theorem and coming up with integer bounds for the values of aa and b⁠,b, as well as noting that a+b=n+k⁠,a + b = n + k, and using this to prove the theorem!

Then, calculating the actual value of ab{b} gives 630⁠,138⁠,897/397⁠,573⁠,379.630,138,897/397,573,379. The total length of a cycle is thus greater than 1⁠,027⁠,712⁠,276.1,027,712,276.

Thus, we have a huge lower bound, which is helpful! We are likely to be able to push this bound up even higher. In fact, Tomas Oliveira e Silva confirmed with a computer that all Collatz sequences with initial values up to 32532.702×10163 \cdot 2^{53} \approx 2.702 \times 10^{16} converge to the 4-2-1 cycle.

Another significant advancement in looking at cycles came in 2019 when Terrence Tao proved essentially the name of his paper “Almost all Collatz orbits attain almost bounded values.” He proved the Collatz Conjecture for almost all numbers, and with the following theorem:

For any function g(n)g(n) of positive integers such that limng(n)⁠,lim_{n \to \infty} g(n) \to \infty, the minimal value of the Collatz sequence must be less than or equal to g(n)g(n) for the starting integer nn in the Collatz sequence (Tao 2019).

A simple example of this is just ascending integers: 1⁠,2⁠,1, 2, \cdots. This theorem thus proves that the minimal value in a Collatz sequence starting with nn is always less than n.n.

Stopping Times

The stopping time of a number as the amount of iterations before a number reaches 1.1.

A lot of work has been done for the Collatz conjecture, but there has not been as much analysis on the actual stopping times - namely, which values are most common for the stopping time of a function and how the stopping times are viewed.

Graphing the stopping times with transitions is rather unhelpful due to the density of the lines.

images/1000 Collatz Stopping Times with Lines.png

Removing the connections between adjacent points, we can see the following graph of stopping times.

I calculated these stopping times just by running the recursive function, and with dynamic programming to speed up the process. Still, it was extremely fast, and easily replicable.

{L}{0.7 \textwidth} images/1000 Collatz Stopping Times.png

In fact, we can predict that this pattern will probably continue to a higher number of x-values, too. It is a pretty pattern, and there seems to be a few different values that have the highest amount of stopping times - clustered around 105, and around 25. A paper by Lagarias from 1985 presents similar results for the range just beyond 1000, with a continued cluster at 23 and at 80, both different numbers in terms of primality and value. It’s strange!

To confirm the continued pattern, we can look at 10,000 datapoints.

images/10000 Collatz Stopping Times.png

As shown in the larger figure, the pattern does continue. Taking the log of the values in the hopes that it could clarify the stopping times gives the following graph, but the graph does not have a clearer pattern.

{R}{0.5 \textwidth} images/10,000 log.png Log of 10,000 stopping times}

However, it is pretty cool to look at!

The stopping times have been calculated for all 1<n<2681 < n < 2^{68} as of 2022, and for all of these values, the Collatz conjecture reached 1.