Thursday, 23 June 2011

Non-Vedic Maths: Half and five for the odds

In my series of Non-Vedic maths I will present mental arithmetic techniques that allow you to speed up your number-crunching. In contrast to Vedic maths, I will also explain why these techniques work. I believe that being able to add and multiply quickly does not make a good mathematician. But understanding how things work and being able to develop your own techniques is more important.

Enough already! I'm repeating myself!

Today's technique teaches how to multiply any number by 5, 50, 500 etc. The technique is based on the simple fact that

5 = 10/2

In this way multiplication by 5 is the same as multiplication by 10 and the division by 2. By using a technique for quickly dividing by 2 the method is accelerated further.

So let's start with an example. Say you want to multiply 3465 by 5. We start with multiplying by 10, i.e. adding a zero at the end

3465 × 5 = 34650 / 2

Dividing by two can be done in two ways. If you know your multiples of two up to 9×2=18 by heart then the traditional method is probably best. Just divide every digit from left to right including the carry over from the remainder. Note that the carry over is either 0 or 1, so the largest number that you will ever have to divide is 19. In our example we get

34650 / 2 = 17325

This is our result!

Some people are not as comfortable or quick with dividing numbers above 10 by 2. For these people the following alternative method might be faster. First divide every digit by 2 and forget about the remainder or carry over,


Next we shift the number to divide one place to the right (this was our original number) and replace every even digit with 0 and every odd digit with 5.


This second step effectively accounts for the carry overs that we missed in the first step. Now add the results of these two operations together


Because all the digits of the first number result from division of numbers less than ten, none of the digits can be larger than 4. For this reason we never have to worry about carry-overs when adding the two numbers.

For a slightly more mathematical explanation to what we have done here, let's write 34650/2 in the following way

34650/2 = (24640 + 10010)/2
        = 24640/2 + 10010/2
        = 12320 + 5005
        = 17325

To summarise, in order to multiply a number by five

  • append a zero at the right of the number
  • divide every digit by two, ignoring any remainders
  • add five to every digit to the right of an odd digit of the number that you got after the first step.

Wednesday, 15 June 2011

Catalan's theorem and Pillai's conjecture

Number theory always surprises by the seemingly simple questions which turn out to be extremely hard or maybe even impossible to answer. One such question concerns perfect powers of the type
where both x and n are natural numbers. Can we say, which numbers can be expressed as perfect powers? If we don't further restrict n then the answer is simple. Any number x can be represented as x1. So that means the set of numbers that can be represented as perfect powers is the complete set \mathbb{N}. So this question is boring.

But what if we exclude these trivial powers and say that n must be greater than 1? The list of these numbers is the sequence A001597 in OEIS. On the right is a plot of the first 100 of these numbers an. As you can see, the numbers seem to grow more than linearly. In fact for large numbers they seem to grow with n2. This seems to suggest that the gaps between two consecutive numbers also needs to grow. This is intuitively right. As the numbers grow, there are less and less candidates that could produce perfect powets.

But how large are the gaps exactly? More interestingly, as the gaps grow larger, do we still encounter some small gaps? The plot on the left shows the differences gn between two consecutive elements of the sequence plotted against the smaller number for the first 1000 or so numbers
gn = an + 1an,
i.e. we're plotting gn against an. It can be seen that most of the gn seem to lie close to a curve given by
G(x)=  2 \sqrt{x} + 1
which also limits the gn. This simple form is amazing at first, however it becomes quite clear when one looks at just the square numbers x2. If an is a square number then (\sqrt{a_n}+1)^2 is also a perfect square. The difference between the two numbers is 2 \sqrt{a_n}+1. Only few numbers seem to depart from this behaviour and have a lot smaller values. This always happens when there is at least perfect power, that is not a square, between two perfect squares.
So we have an upper bound for the gn. But is there a lower bound for the gn that increases with n. Or does gn always have excursions to small numbers? Let's look first at the smallest possible value gn = 1.
Catalan's theorem makes a statement about this particular case. It says that there is only one n for which gn = 1, or in the full notation
xnym = 1 has only one solution, which is 32 − 23 = 1
Until 2002 this was Catalan's conjecture but then it was proven by Preda Mihăilescu. Long before this proof Pillai generalised the the Catalan conjecture and speculated about the solutions of the equation
xnym = c for c > 1
The graph above seems to suggest that for c = 2 there is also just one solution
33 − 52 = 2
but to my knowledge this has not been proven. Pillai's conjecture now states that for any value of c, the above equation only has a finite number of solutions. This means that the gaps in the sequence of perfect powers not only grow, but there is some increasing lower bound for the gaps as well. Erdos went one step further and conjectured that this lower bound is given by
gn < nk
where k is a fixed constant which is greater than one.
From the plot of the first million gaps this does not immediately seem obvious. The figure shows the plot of the gaps gn against n for n up to 106 on a double-logarithmic scale. Although the lower bound does seem to grow, the value of the parameter k of the Erdos formula, of roughly 0.22 (green line), is determined by a single outlier at a384026 = 143384152921 and g384026 = 17. Almaost all other values of gn lie well above the Erdos curve. Only one other point comes close to the limiting curve at g745174 = 24
One more interesting feature that can be seen are strange lines in the plot that slope at a slightly lower exponent of about 2 / 3. This suggests that there is some hidden structure in the seemingly random behaviour of the gn. Another feature, which I noticed when plotting the gn is that there seem to be islands of regularity.
If we zoom in to the region from 7.4\times10^{11} to 7.6\times 10^{11} we get the following plot. Most points are again crowded around the upper limit but the others have the very regular shape of two parabolas intersecting each other.
As you can see, even formulas that seem simple can have surprisingly rich behaviour and open up very difficult problems.

Saturday, 11 June 2011

Is the bitcoin "Black Friday" really a crash?

I have been following the development of the internet currency bitcoin for the last few days. I heard a lot of people talking about it, so I thought it might be an interesting option for an online currency. The value of a bitcoin has been steadily increasing over the last few weeks up to a record high of over 30$. Yesterday saw a development in the value of a bitcoin that made the media speak of a "Black Friday". I won't go into the discussion about the morality of this online currency as there have been reports that the drug trade is turning to bitcoins as a payment. But here I want to ask the question: Was the "Black Friday" really a crash?

From the data I got from the API, I plotted the graph above. Naturally, we should plot any kind of exponential growth on a semi-logarithmic scale. The origin is some arbitrary point in the past, roughly 1000 hours before the time of writing this post. Older data is averaged over larger intervals, because it is not available any more as real-time data. One can see that, on average, the value of the currency has doubled every 7 days. I have also plotted an upper resistance and a lower support that both grow at this constant average growth rate.

You can see that the value of a bitcoin has kept to this channel between its resistance and its support for almost 1000 hours. And even today it only made a very short excursion below the support curve. Note that it could have made this kind of excursion 200 hours ago as well. The averaged nature of the data for that time period makes it impossible to judge exactly what happened then.

To be speaking of a "Black Friday" is therefore very premature. All that happened was that the over-exponential growth of the last few days has been compensated for. Of course it has to be seen wether the support will hold. But even if it doesn't, there is a good chance of a sideways trend to follow the rapid growth of the currency.

My conscience dictates that I write a warning and disclaimer at the end of this post. I am not endorsing the bitcoin currency and I am not encouraging to invest or to trade in bitcoins. Please be very careful as to where you invest your money and don't take any online article as the ultimate truth about the matter.

Thursday, 2 June 2011

Binary Trees in SQL: The Adjacency List

When learning to program, a key skill is to know you data structures. For every task there is the ideal data structure that makes the task easy and straightforward to implement. But then we are thrown into the real world. In this real world we are told to finish a piece of code with the restriction that the data has to be stored in a SQL database. Most of the time we think of databases essentially being arrays in which the data is stored in a tabular way. In the back of our head we know that a more sophisticated data structure would be more fit for the task, but we don't have the time or can't be bothered to think about these luxuries in more detail. One prime example is the tree data structure. Wouldn't it be great if there was a simple, straightforward way to implement trees in SQL?

I am going to show how this can be done. As it turns out there is more than one solution to the problem. In this post I will be writing about the first and maybe the most intuitive approach, the adjacency list. In the end this approach will turn out to be OK but not ideal for all applications. In two following posts I will then present more stable approaches to the problem, the nested set and the binary heap.

Adjancency List

The adjacency list is probably the implementation of a binary tree that is the easiest to understand and the one that most people would come up with if asked to implement a tree structure. Every entry in the tree connects two elements by a parent child relationship. We want to keep things neat and tidy, so we make sure that we separate data from structure. What this means is that we have a table with all the data, somewhat like this


Now we define the table that will hold the structure of the tree. Every entry in this tree will link a child node to a parent node and therefore has two references to our data table

CREATE TABLE bintree (parent INTEGER, child INTEGER NOT NULL, rel CHAR(1));

You will notice an extra column 'rel' in this table which can hold the two values 'L' or 'R'. Because we are storing a binary tree, each node is either the left or the right node of it's parent. By specifying the relation to it's parent we can perform proper binary searches in the table. Keeping the two tables separate allows us to create the tree structure for the data relatively independently of the data itself. We could also sort the same data set in two or more trees for different sorting criteria.

Now let's start populating our tables with data. When we insert a piece of data into our data table, we must also insert the relationship to the other data sets into the tree table. The next two insert statements do the job.

INSERT INTO people VALUES (1,'John'),(2,'Ben'),(3,'Sue');

INSERT INTO bintree VALUES (-1,1,''),(1,2,'L'),(1,3,'R');

But note how we have to take care that we retain the tree structure. The parent must exist and it can have a maximum of one left and one right child node. Only the root node does not have a parent which is denoted by a -1 in the parent column. This constraint can be tested using check statements, but I will not go into the details of that in this post. Deleting a leaf node from the tree is also done using two simple drop statements, for example to remove Ben from the tree we execute the following

DELETE FROM people WHERE id=2;

DELETE FROM bintree WHERE child=2;

We have to ensure manually that the node deleted in this way doesn't have any children of its own. If it does, the children will end up orphaned and unreachable. Checking this can again be done using check statements. The real problem arises when you try to delete a node and all it's children. The only way to achieve this is by going through all the elements which are reachable from the node and delete them individually. This can be quite time consuming if you have a larger database.

Another common task is to find out if one node A is the ancestor of another node C. Again, a number of SQL statements have to be issued. Starting from the child node C, we find it's parent

SELECT parent FROM bintree WHERE child=[C]

If this returns the ancestor A then we know that the two nodes are related by an ancestor-descendant relationship. If the statement returns nothing, then we have reached the root of the tree telling us that the two nodes are not related. If we get any other node then we issue the same command again, but with the new node as the child node. We continue this way until we have either found the ancestor or reached the root node. The PHP code for this algorithm is as follows

// a function that returns true if $parent is an ancestor of child and false otherwise
function isAncestor($parent,$child) {
  $node = $child;
  while ($node != -1) {
    $query="SELECT parent FROM bintree where child=".$node;
    $result=mysql_query($query); // should return exactly one result
    if ($node==$parent) return true;
  return false;

As you can see, the adjacency list layout for storing trees is not the ideal method. Some simple operations take multiple SQL queries and the consistency of the tree is not automatically ensured. Apart from orphaned nodes we have to avoid cyclic dependencies, where one node ends up to be it's own ancestor. In another post I will write about alternative methods that will alleviate some of these problems.

Wednesday, 1 June 2011

Two Ladders Puzzle: Answer

Previously I posted the Two Ladders Puzzle in which two ladders lean against opposite walls. The length of the ladders is given and so is the height of the point where they intersect. The task was to find the distance between the walls. The problem can be solved using Pythagoras and the law of similar triangles, so really it can classify as an easy problem. Nonetheless, the answer requires the solution of a quartic equation. We begin by naming the lengths of the ladders a and b and the, as yet unknown heights at which the ladders lean against the walls by x and y. Also let's divide the distance between the two walls into the parts m and n. We now have 5 unknowns, so we need 5 equations. The easy one is

m + n = d

Then we have Pythagoras for the two ladders

a2 = d2 + x2

b2 = d2 + y2

Finally we have the similar triangles

\frac{d}{x} = \frac{n}{h}

\frac{d}{y} = \frac{m}{h}

If we add these two equations, we get the sum (m + n) / h on the right hand side. But m + n = d and thus we can eliminate d to obtain the nice intermediate result

\frac{1}{h} = \frac{1}{x} + \frac{1}{y}

This means that h is the harmonic mean of x and y, independent of the values of a,b or d.

The rest of the solution will become a tiny bit messy. Instead of looking for a solution of d, we can look first for the values of x or y. Then we can use Pythagoras to find d. To do this we subtract the two Pythagorean equations to eliminate d

x2y2 = a2b2

Solving the harmonic equation for y we get

y = \frac{hx}{x-h}

With this we can eliminate y from the above difference of squares

x^2-\frac{h^2 x^2}{(x-h)^2} = a^2-b^2

Obviously this is a fourth order equation for x. I know this is not really a nice result so we will have to get some help for solving it. So let's insert the values and stick it into WolframAlpha. Of course there are four possible solutions. We get one positive, one negative and two complex solutions. Only the positive solution of x \approx 3.03692 is relevant. Now we can use the first Pythagoras to find the value for d

d \approx 2.60329

Friday, 27 May 2011

Tutorial: Midpoint rule

In a previous post I spoke about the trapezium rule for integrating functions numerically. The trapezium rule could find the value of an integral with second order accuracy. In this post I will present an alternative method, the midpoint rule. The geometric interpretation of the midpoint rule appears to be more crude than the trapezium rule. Nonetheless, it turns out that the midpoint rule also has second order accuracy.

Like the trapezium rule, we divide our interval [a;b] into N equal sections of width h = (ba) / N. The area of each section is now approximated by the area of the rectangle of width h and whose height is given by the function value at the centre of the section. The centre of the nth section is given by
xn + 1 / 2 = a + (n + 1 / 2)h.

Here n = 0,1,...,N − 1. As before, we define fn + 1 / 2 = f(xn + 1 / 2). The area of the nth rectangle is then given by hfn + 1 / 2. And the value of the integral is given by

\int_a^b f(x)\;dx = h( f_{1/2} + f_{3/2} + \ldots + f_{N-1/2}) = h\sum_{k=0}^{N-1}f_{k + 1/2}

This is the final form for the midpoint rule. As with the trapezium rule, the accuracy of the integral can be increased by increasing N, i.e. decreasing the step size h. Again we can plot the error of the numerical integral against the number of steps on a logarithmic scale.

In figure the error of the integral

\int_0^1\sin(\pi x) dx = \frac{2}{\pi}

when evaluated with the midpoint and the trapezium rule is plotted. Clearly, the midpoint rule has a second order error, as has the trapezium rule. At first sight this is somewhat surprising since the midpoint rule makes no attempt to incorporate the change in the function value over the sections into the approximation.

To understand the accuracy of the approximation, we have to re-interpret the midpoint formula hfn + 1 / 2 as the area of a trapezoid that intersects the function graph at the midpoint an which has a slope identical to the derivative of the function at that point. It is a simple, but very useful, geometric fact that the area of a trapezoid is the same as the area of a rectangle whose height is the average of the trapezoid's height.

Looking at the error curves a bit more closely, we can observe that the error of the midpoint rule is always slightly smaller than that of the trapezium rule. In fact when we divide the two error curves in the region where they decrease with N^2, we get an almost constant error ratio of about 1/2. This will lead us to improve the two methods. But this will be the subject of another post.

Wednesday, 25 May 2011

Two Ladders Puzzle

This puzzle is an old favourite of mine. It looks quite straightforward but it's not as easy as one might expect.

Two ladders are leaning against opposite walls in an alleyway. The bottom of each ladder is placed on the ground in the opposite corner. One ladder is 4m long, the other is 3m long. The height at which the two ladders meet is 1m.

Question: How far are the walls apart?

Friday, 20 May 2011

Short Thoughts: If a tree falls in the forest...

...does it make a sound. This seems to be a deep philosophical question, but is it really? It just depends on what your definition of sound is. If you define sound as pressure waves in air then the answer is clearly YES. The pressure waves are there, and might influence other objects in the forest.

If, on the other hand, you define sound as through the experience of a human being, then you are linking it to the perception of an individual. In this sense the answer is NO. The tree does not make a sound because there is no one whose brain activity is changed in such a way that she would call it a sound.

I find this definition problematic. What if you had a sound recording device? You could record the sound and play it back later. If no one was present when the device made its recording then where did the sound come from. Did the recording device create the sound from scratch? Or did the tree suddenly make a sound the moment you observe the sound through the recording device? In this way the question resembles the observer problem in quantum mechanics.

But if you go down this route then you have to be consistent and follow this argument through to its end. You should not allow anything to exist unless you are observing it. So the final answer to the question must be:

If a tree falls in the forest and no one is there to hear it then the tree didn't exist in the first place.

Wednesday, 18 May 2011

Non-Vedic Maths: One more here, one more there, do I care?

Multiplying two numbers whose last digits add up to powers of 10

This is my first post in the series of Non-Vedic maths, where I will be explaining speed calculation techniques. These techniques are NOT based on the ancient Indian Vedas but can be derived using elementary maths.

In this first instalment of the series I will be looking at how to multiply two numbers whose last digits add up to powers of 10 and whose leading digits are identical. To explain the method let us first look at two digit numbers. As an example, we multiply 63 with 67. For this method to work the first digit has to be the same for both numbers and the last digits have to add up to 10. Then we can write our to numbers as

10a + b and 10a + c

where b + c = 10 and a is the leading digit. In our example we have a = 6, b = 3 and c = 7. Multiplying the two numbers gives

(10a+b)(10a+c) &= 100a^2 + 10(ab + ac) + bc\\
&= 100a^2 + 10a(b+c) + bc\\
&= 100a^2 + 100a + bc\\
&= 100a(a+1)+bc

In the third step we used that b+c=10. The product bc will always be less than 100 so the digits will not interfere with the first term, which is a multiple of 100. In our example we have

63 * 67 = 100 * 6 * 7 + 3 * 7 = 100 * 42 + 21 = 4221

The rule that can be extracted from this is:

Take the first digit and multiply with one more than itself. Multiply the last two digits. The final answer is made up of the two results by joining the results together.

We can represent this graphically in the following way.

The method clearly also works if a is not a single digit number, but it can have any number of digits. If we wanted to multiply 116 with 114 we will get

100 * 11 * 12 + 6 * 4 = 13200 + 24 = 13224.

The method can be generalised if the trailing two digits add up to 100 or the trailing three digits add up to 1000 and so on. The maths is similar to the previous case.

(100a+b)(100a+c) &= 100a^2 + 100(ab + ac) + bc\\
&= 10000a^2 + 100a(b+c) + bc\\
&= 10000a^2 + 10000a + bc\\
&= 10000a(a+1)+bc\end{alignat}

Again, bc will always be less than 10000 because each of them is less than 100. To see an example, lLet's multiply the numbers 436 and 464. Here the last two digits add up to 100 and the leading digit is the same in both numbers.

436 * 464 &= 10000*4*5 + 36*64\\
&= 10000*20 + 2304\\
&= 202304

Of course one has to multiply two digit numbers, which can be a little more complicated. In another post I will present methods that can make this multiplication easier.

Non-Vedic Maths: An Introduction

In the past I have repeatedly been confronted with people who promote Vedic Mathematics. Vedic Maths is a system of mathematical procedures that is supposed to help you master mathematics. The people promoting this system claim that it is based on the old Indian Vedic Sutras which date back over 4000 years. A little study of the subject reveals a slightly different story. Vedic mathematics was first presented by an Indian mathematician called Bharati Krishna Tirthaji Maharaja, in the early part 1900s.

He claimed to have found uncovered the system by intensely studying the old Vedas. However, most scholars, that are not directly involved in promoting the system, agree that the Vedas do not actually contain any of the "Vedic mathematics" sutras. It is, therefore, much more likely that a Tirthaji invented the methods himself, or even copied them from other sources. Another argument against the origin from the Vedas is that the techniques described in Vedic maths heavily rely on the decimal number system. But this system was not invented until much later, after the Vedas were written down.

The Vedic maths system claims to provide calculation strategies which are creative and useful. But they don't really promote a deeper understanding of mathematics. All that these methods really do, is to provide the student with some shortcuts for performing standard calculations in their head. It is interesting to note that the mathematics behind the system is not very sophisticated and was certainly known at the time that a Tirthaji developed these methods.

So, quite clearly, Vedic maths does not date back to Vedic times and cannot be extracted from the Vedas. It is a modern invention that uses the claim of dating back to ancient times in order to promote itself. But all this doesn't answer the crucial question, is the Vedic maths system useful? As I said before, the methods provided do not really promote analytical thinking and, in todays world of computers and pocket calculators, they are not needed in everyday life, and they will not help the student in their future job. To make matters worse, the Vedic mathematics system does not explain why or how the techniques work. So the student is left with learning the methods by heart and repeat blindly. If you believe that education should involve more than pure repetition of facts, and you don't want your children to grow up being parrots, then avoid the Vedic maths system. For two more passionate articles against the system, you can read The Fraud of Vedic Maths or Stop this Fraud on our Children!

Arithmetic shortcuts do have their use however, and that is to impress your friends. If you can find the result of a complex product or sum before your friends can get out their iPhones and start the calculator app, then they will be truly impressed. And if you can still pull this feat after a few pints in the pub, you will be considered the maths genius for the rest of your life. For this reason alone, I will start a series of posts under the heading "Non-Vedic Maths". This series will present mathematical  parlour tricks and shortcuts for everyday calculations to impress your friends. I will also go the extra mile and explain why the techniques work. But be warned: learning these methods will not help you in any other way, and you will most certainly be branded a geek by everyone around you.

Read on for the first technique.

Monday, 16 May 2011

Collatz: Bringing some order into the chaos

I recently wrote that I was rekindling my interest in the Collatz problem. I still haven't had the time to read any literature but I came up with an interesting transformation to the problem that seems to unearth some structure which is maybe not directly obvious. For regular readers of this blog, you might see the similarities with this transformation and the formula I posted earlier on calculating the points for interval halving using a closed formula. It really was the Collatz problem that inspired that formula. To the uninitiated, the Collatz problem is based on the sequence
x_{n+1} = \begin{cases}
x_n/2 & x_n \text{ even}\\
3x_n + 1 & x_n \text{ odd}

The question is whether, for any positive integer starting value, the sequence will always reach the value 1.

One can reverse the problem in the following way. Define the set C in the following way

&1 \in C\\
&\text{if } x \in C \quad\text{ then }\quad2x \in C\\
&\text{if } x \in C \quad\text{ and }\quad x \text{ mod } 3 = 1 \quad\text{ then }\quad\frac{x-1}{3} \in C

The problem then transforms to the question, whether C contains all positive integers, in other words: is C=\mathbb{N}? In this form the problem can be represented as a tree with 1 at its root, every element x has at most two children. One child is always 2x, the other child is (x − 1) / 3 if that number is an integer.

Every odd number x trivially has an infinite number of descendants of the form 2kx. I will call the set of these numbers the tower of x.

Definition: The tower Tx of an odd number x is the set T_x=\{2^k x | k=0,1,2,\ldots\}. x is called the base of the tower.

We also define the children of T in the natural way, as a set of towers which are based on those children of the elements of T which are not elements of T themselves.

Definition: The children CT of a tower T is the set

C_T = \{T_y | y = (x-1)/3 \quad\text{ where }\quad x \in T\quad\text{ and }\quad x \text{ mod } 3 = 1 \}

It's easy to show that every tower either has no children, if the base is a multiple of 3, or an infinite number of children otherwise.

I would like to find a way to map each tower onto a single number, i.e. every element x of Tx should be mapped onto the same value. To do this I take the largest power kx of 2 that is smaller or equal to x: 2^{k_x} \le x < 2^{k_x+1}. We now define

t_x := 2^{-k_x} x - 1

The values of tx lie between 0 and 1 and are rational numbers of the form

t_x = \frac{p}{2^k}.

It is clear that k2x = kx + 1 and therefore t2x = tx. This means, all the elements of a tower are uniquely mapped onto a single value. We can, therefore, identify these values with the towers they originate from and we will refer to the towers T by their t_T := t_x, x \in T.

The next question that arises is, given an tower tT what do the children of a tower look like. As an example, the children of t7 = 3 / 4 are plotted in the figure on the right. The values of the first few of these children are given in the following table.

Children of t = 3/4
Fraction Decimal
1 / 8 0.125
5 / 32 0.15625
21 / 128 0.16406
85 / 512 0.16602
341 / 2048     0.16650

It can be seen, and easily proved, that if p / 2k is a child of any t, then so is (4p + 1) / 2k + 2. This results in an geometric series that converges to

(3p+1)/(3\times 2^k)

The figure above plots the children of the first 1000 towers against the values of the towers themselves. Interestingly a clear structure emerges. The sequences of the children all seem to converge against two straight line segments. These line segments can be represented by the function

y = \begin{cases}
\frac{1}{3}( 4x + 1) & \quad \text{for } x< \frac{1}{2}\\
\frac{1}{3}( 2x - 1) & \quad \text{for } x> \frac{1}{2}

When I first plotted this graph and discovered this very simple structure hidden in the Collatz problem, I was utterly amazed. Of course I don't know enough to decide if this structure could hold the key to unravelling the problem but it certainly gives me hope.

Finally, an apology to all those who know more about the problem. These things probably have been discovered already. I have not supplied any references and most likely got the terminology wrong. If so, I will be happy to be corrected.

Wednesday, 11 May 2011

The Three Wise Men Puzzle: Answer

This is the answer to the "Three Wise Men Puzzle" posted earlier.

To understand the reason the men stop laughing we follow the thoughts of one of the philosophers, let's call him A. A sees the other two men, B and C laughing. Assuming that he doesn't have a mark on his forehead, he thinks that B is laughing at C and vice versa. But he must also assume that B is unaware of his own mark on the forehead. If A had no mark, what does B think C is laughing about? In other words, if A's assumption was true, B should quickly realise that C is laughing at B and therefore B should stop laughing. But after enough time has passed, and B hasn't stopped, A must assume that he himself has a spot on his forehead.

Successive Interval Halving

In a previous post I talked about the trapezium rule for approximating integrals. One nice feature was that one can successively increase the resolution without having to evaluate the function at points at which it has already been evaluated once. The formula I gave was

If one uses this method from the onset, this will result at the function being evaluated at the following points

Step i    y = (xa) / (ba)
0 0
1 1
2 1/2
3 1/4
4 3/4
5 1/8
6 3/8
7 5/8
8 7/8

and so on. A few days ago I was creating a demonstration and I wanted to create a longer table with these values. I didn't want to create it by hand so I was looking for a closed formula that produces the values in the second column from those in the first column. After some pondering over the problem I came up with the following formula

y=(2n+1) 2^{-\lfloor\log_2(2n+1)\rfloor} - 1

Here the \lfloor.\rfloor denotes the floor function, i.e. the largest integer that is smaller or equal the argument. This formula will produce all the numbers in the table, including the zero, but not the one. The one, therefore, still has to be added by hand. Usually I would advise against using logarithms and exponentials in numerical algorithms but, because they are to the base 2, they can be implemented by bit shifting operations.

There is actually an interesting relation between 2n + 1 and y when we write them as binary numbers. In fact, if we write 2n + 1 as binary and place a decimal point after the leading 1 we get y + 1. Here is an example

2n+1    100111012
y+1 1.00111012
y 0.00111012

Tuesday, 10 May 2011

Wolfram's Target Audience

I am hooked on to Wolfram Alpha. I really think that it rocks and I can't live without it. Some days ago I wanted to bake a bread with my new bread baking machine. But I don't have any scales in the house to weigh the ingredients and the recipe tells me to use 250g of rye flour. But I do have a metric measuring cup. So I typed in density of rye flour and I get 540 g/dm^3 (grams per cubic decimeter) as answer. I know that a decimeter^3 is the same as a litre so a simple division tells me that I need 0.46l or 460ml of rye flour.

Yesterday I was using it for some more mathematical purpose when my eye fell on this ad for the Mathematica software. They use the slogan "One software to rule them all" a modification of a quote from "Lord of the Rings". Of course, my first reaction was "Wow, these guys are cool!" Then I calmed down a bit and started thinking about Wolfram's marketing strategy. My first thought was then replaced by the more sobering thought "Wow, these guys really know their target audience!" I am not really a fan of  marketing people because I think that targeted marketing aims at manipulating people. But funnily enough, in this case, I don't mind. Maybe, deep down, I think that the people who devised this ad had as much fun creating the ad as I have looking at it. But maybe that is exactly what they want me to believe. Maybe the marketing department of Wolfram are just a scheming bunch of cold blooded guys who know exactly what a geek like me would fall for. Maybe they hid the Easter eggs in Wolfram Alpha just to attract those nerds who have nothing better to do than to sit in front of the computer, trying to tease clever answers out of a machine.Well, they have succeeded. I think Mathematica is a great software I have had the pleasure to work with and I think, by making Wolfram Alpha and the Wolfram Integrator available to everyone, they have become more likeable. And yes, I will still waste my time by looking for Easter eggs that have not been posted on the internet yet.

Monday, 9 May 2011

Three Philosophers Puzzle (Three Wise Men Puzzle)

Here is a logic problem which I was told some years ago and I found it quite interesting at the time. I tried finding a reference to this problem on the internet but wasn't really successful.

EDIT: Thanks to one of the readers, who pointed out that the puzzle is actually well known and is called the "Three Wise Men Puzzle". A variation of the puzzle is the "Muddy Children Problem".

The problem goes like this:

The Three Philosophers  -  by Giorgione
Three philosophers are sitting in the shade of a tree discussing deep philosophical problems. They talk and talk and, of course, they don't reach an agreement or come any closer to a solution. After some hours of talking the afternoon sun makes the three men grow tired and they finally decide to take a nap. While the philosophers are sleeping a young boy from the local village passes by. He looks at the men and decides to play a prank on them. The boy takes some white paint and on each of their foreheads he paints a white spot that resembles bird droppings. After some time the philosophers wake up again and, looking at the others, they all burst out into laughter. Of course each of them thinks that a bird has relieved itself on their colleagues foreheads but none of them is aware that, he too, has a white spot on their forehead. In this way they keep on laughing, but they don't let the others know of their plight. But suddenly they all stop laughing, suddenly realising that they too must have the mark on their forehead.

Question: How, just by using logic, did they arrive at this conclusion?

The answer can be found here.

Friday, 6 May 2011

Tutorial: Trapezium Method

A subject I am particularly interested in is everything to do with numerical and mathematical algorithms. For this reason I decided that I am going to start a series on numerics. A second series will be on algorithms and discrete maths. The topics I am going to cover will not necessarily be published in any logical order as I will write about stuff that comes to my mind. But I will provide a page here which will act as a kind of index. In this way I hope that, after some time, I will be able to provide a tutorial on the subjects together with the odd little gem strewn in between.

Trapezium Method of Numerical Integration

When evaluating an integral using the trapezium rule the function f to be integrated is evaluated at discrete points xi between the integration boundaries a and b

xi = a + hi

where the interval size h is given by

h = \frac{b-a}{N},

N is the number of intervals and i=0\ldots N. I will also refer to N as the resolution. In practice N is often chosen to be a power of 2. There is no particular reason for this other than the fact that one commonly increases the resolution by a factor of 2 when improving the accuracy.

Let's abbreviate the notation by defining the discrete function values

f^N_i = f(x_i)

Here I used the upper index N to clarify that the discrete function values depend on the resolution. With this definition we can now write the trapezium approximation TN of resolution N to the integral as

\int_a^b f(x) dx \approx \frac{1}{2h}\left( f^N_0 + 2 \sum_{i=1}^{N-1} f^N_i + f^N_N \right) =: T_N

What this means geometrically is explained in the following graph. The integral of a function is the area below the graph of that function. The trapezium rule approximates this area by adding the areas of the trapezoids created by the x-axis and the function values at the discrete points. Each trapezoid has the area

\frac{1}{2h}\left( f_i + f_{i+1} \right)

Summing these areas up, one obtains the above formula. Let's do an example. We want to integrate the function sin(πx) between the limits 0 and 1. Using N=4 we get

T_4 = \frac{1}{8}\left( 0 +  1.41421 + 2.00000 + 1.41421 + 0 \right) =  0.60355

Comparing this with the exact solution

\int_0^1\sin(\pi x) dx = \frac{2}{\pi}

We see that, with this crude approximation, the error is about \varepsilon=0.03307

Of course one expects this approximation to get better as N increases and thus h decreases. In fact, one can show that, for sufficiently smooth functions

\int_a^b f(x) dx = T_N + O(h^2).

The above equation means that the error in the trapezium approximation decreases quadratically with the number of steps.

Let's see this on our example. In the diagram I plotted the error against the resolution on a double logarithmic plot. Up to N = 1e6 the curve is almost a straight line with a slope of -2. To show this the green line is the function \varepsilon \propto N^{-2}. This means that, in order reduce the error by a factor of four, one has to double the resolution.

One interesting feature of the trapezium rule is the fact that it makes the doubling of the resolution easy for us. When going from N to 2N, one does not have to evaluate the function at all the 2N points. Instead one can use the result TN of the previous approximation. The rule is

T_{2N} = \frac{1}{2} T_N + \frac{1}{2h} \sum_{i=1,3,\ldots}^{2N-1}

This formula makes it easy to successively increase the resolution until the required accuracy is achieved.

Finally I'd like to comment on the behaviour of the error when N gets really large. In the figure above you can see that the error levels off at about 2 × 10 − 7 when N increases above 2000. This error is above the numerical accuracy, which is about 10 − 8 in my calculations. When N increases further the error starts to increase again until, in the end it is even larger than before. The reason for this large error is that the individual terms in the sum become very small. Adding many small values up results in a loss of accuracy due to rounding errors.