Friday, 27 May 2011

Tutorial: Midpoint rule

In a previous post I spoke about the trapezium rule for integrating functions numerically. The trapezium rule could find the value of an integral with second order accuracy. In this post I will present an alternative method, the midpoint rule. The geometric interpretation of the midpoint rule appears to be more crude than the trapezium rule. Nonetheless, it turns out that the midpoint rule also has second order accuracy.

Like the trapezium rule, we divide our interval [a;b] into N equal sections of width h = (ba) / N. The area of each section is now approximated by the area of the rectangle of width h and whose height is given by the function value at the centre of the section. The centre of the nth section is given by
xn + 1 / 2 = a + (n + 1 / 2)h.

Here n = 0,1,...,N − 1. As before, we define fn + 1 / 2 = f(xn + 1 / 2). The area of the nth rectangle is then given by hfn + 1 / 2. And the value of the integral is given by

\int_a^b f(x)\;dx = h( f_{1/2} + f_{3/2} + \ldots + f_{N-1/2}) = h\sum_{k=0}^{N-1}f_{k + 1/2}

This is the final form for the midpoint rule. As with the trapezium rule, the accuracy of the integral can be increased by increasing N, i.e. decreasing the step size h. Again we can plot the error of the numerical integral against the number of steps on a logarithmic scale.

In figure the error of the integral

\int_0^1\sin(\pi x) dx = \frac{2}{\pi}

when evaluated with the midpoint and the trapezium rule is plotted. Clearly, the midpoint rule has a second order error, as has the trapezium rule. At first sight this is somewhat surprising since the midpoint rule makes no attempt to incorporate the change in the function value over the sections into the approximation.

To understand the accuracy of the approximation, we have to re-interpret the midpoint formula hfn + 1 / 2 as the area of a trapezoid that intersects the function graph at the midpoint an which has a slope identical to the derivative of the function at that point. It is a simple, but very useful, geometric fact that the area of a trapezoid is the same as the area of a rectangle whose height is the average of the trapezoid's height.

Looking at the error curves a bit more closely, we can observe that the error of the midpoint rule is always slightly smaller than that of the trapezium rule. In fact when we divide the two error curves in the region where they decrease with N^2, we get an almost constant error ratio of about 1/2. This will lead us to improve the two methods. But this will be the subject of another post.

Wednesday, 25 May 2011

Two Ladders Puzzle

This puzzle is an old favourite of mine. It looks quite straightforward but it's not as easy as one might expect.

Two ladders are leaning against opposite walls in an alleyway. The bottom of each ladder is placed on the ground in the opposite corner. One ladder is 4m long, the other is 3m long. The height at which the two ladders meet is 1m.

Question: How far are the walls apart?

Friday, 20 May 2011

Short Thoughts: If a tree falls in the forest...

...does it make a sound. This seems to be a deep philosophical question, but is it really? It just depends on what your definition of sound is. If you define sound as pressure waves in air then the answer is clearly YES. The pressure waves are there, and might influence other objects in the forest.

If, on the other hand, you define sound as through the experience of a human being, then you are linking it to the perception of an individual. In this sense the answer is NO. The tree does not make a sound because there is no one whose brain activity is changed in such a way that she would call it a sound.

I find this definition problematic. What if you had a sound recording device? You could record the sound and play it back later. If no one was present when the device made its recording then where did the sound come from. Did the recording device create the sound from scratch? Or did the tree suddenly make a sound the moment you observe the sound through the recording device? In this way the question resembles the observer problem in quantum mechanics.

But if you go down this route then you have to be consistent and follow this argument through to its end. You should not allow anything to exist unless you are observing it. So the final answer to the question must be:

If a tree falls in the forest and no one is there to hear it then the tree didn't exist in the first place.

Wednesday, 18 May 2011

Non-Vedic Maths: One more here, one more there, do I care?

Multiplying two numbers whose last digits add up to powers of 10

This is my first post in the series of Non-Vedic maths, where I will be explaining speed calculation techniques. These techniques are NOT based on the ancient Indian Vedas but can be derived using elementary maths.

In this first instalment of the series I will be looking at how to multiply two numbers whose last digits add up to powers of 10 and whose leading digits are identical. To explain the method let us first look at two digit numbers. As an example, we multiply 63 with 67. For this method to work the first digit has to be the same for both numbers and the last digits have to add up to 10. Then we can write our to numbers as

10a + b and 10a + c

where b + c = 10 and a is the leading digit. In our example we have a = 6, b = 3 and c = 7. Multiplying the two numbers gives

(10a+b)(10a+c) &= 100a^2 + 10(ab + ac) + bc\\
&= 100a^2 + 10a(b+c) + bc\\
&= 100a^2 + 100a + bc\\
&= 100a(a+1)+bc

In the third step we used that b+c=10. The product bc will always be less than 100 so the digits will not interfere with the first term, which is a multiple of 100. In our example we have

63 * 67 = 100 * 6 * 7 + 3 * 7 = 100 * 42 + 21 = 4221

The rule that can be extracted from this is:

Take the first digit and multiply with one more than itself. Multiply the last two digits. The final answer is made up of the two results by joining the results together.

We can represent this graphically in the following way.

The method clearly also works if a is not a single digit number, but it can have any number of digits. If we wanted to multiply 116 with 114 we will get

100 * 11 * 12 + 6 * 4 = 13200 + 24 = 13224.

The method can be generalised if the trailing two digits add up to 100 or the trailing three digits add up to 1000 and so on. The maths is similar to the previous case.

(100a+b)(100a+c) &= 100a^2 + 100(ab + ac) + bc\\
&= 10000a^2 + 100a(b+c) + bc\\
&= 10000a^2 + 10000a + bc\\
&= 10000a(a+1)+bc\end{alignat}

Again, bc will always be less than 10000 because each of them is less than 100. To see an example, lLet's multiply the numbers 436 and 464. Here the last two digits add up to 100 and the leading digit is the same in both numbers.

436 * 464 &= 10000*4*5 + 36*64\\
&= 10000*20 + 2304\\
&= 202304

Of course one has to multiply two digit numbers, which can be a little more complicated. In another post I will present methods that can make this multiplication easier.

Non-Vedic Maths: An Introduction

In the past I have repeatedly been confronted with people who promote Vedic Mathematics. Vedic Maths is a system of mathematical procedures that is supposed to help you master mathematics. The people promoting this system claim that it is based on the old Indian Vedic Sutras which date back over 4000 years. A little study of the subject reveals a slightly different story. Vedic mathematics was first presented by an Indian mathematician called Bharati Krishna Tirthaji Maharaja, in the early part 1900s.

He claimed to have found uncovered the system by intensely studying the old Vedas. However, most scholars, that are not directly involved in promoting the system, agree that the Vedas do not actually contain any of the "Vedic mathematics" sutras. It is, therefore, much more likely that a Tirthaji invented the methods himself, or even copied them from other sources. Another argument against the origin from the Vedas is that the techniques described in Vedic maths heavily rely on the decimal number system. But this system was not invented until much later, after the Vedas were written down.

The Vedic maths system claims to provide calculation strategies which are creative and useful. But they don't really promote a deeper understanding of mathematics. All that these methods really do, is to provide the student with some shortcuts for performing standard calculations in their head. It is interesting to note that the mathematics behind the system is not very sophisticated and was certainly known at the time that a Tirthaji developed these methods.

So, quite clearly, Vedic maths does not date back to Vedic times and cannot be extracted from the Vedas. It is a modern invention that uses the claim of dating back to ancient times in order to promote itself. But all this doesn't answer the crucial question, is the Vedic maths system useful? As I said before, the methods provided do not really promote analytical thinking and, in todays world of computers and pocket calculators, they are not needed in everyday life, and they will not help the student in their future job. To make matters worse, the Vedic mathematics system does not explain why or how the techniques work. So the student is left with learning the methods by heart and repeat blindly. If you believe that education should involve more than pure repetition of facts, and you don't want your children to grow up being parrots, then avoid the Vedic maths system. For two more passionate articles against the system, you can read The Fraud of Vedic Maths or Stop this Fraud on our Children!

Arithmetic shortcuts do have their use however, and that is to impress your friends. If you can find the result of a complex product or sum before your friends can get out their iPhones and start the calculator app, then they will be truly impressed. And if you can still pull this feat after a few pints in the pub, you will be considered the maths genius for the rest of your life. For this reason alone, I will start a series of posts under the heading "Non-Vedic Maths". This series will present mathematical  parlour tricks and shortcuts for everyday calculations to impress your friends. I will also go the extra mile and explain why the techniques work. But be warned: learning these methods will not help you in any other way, and you will most certainly be branded a geek by everyone around you.

Read on for the first technique.

Monday, 16 May 2011

Collatz: Bringing some order into the chaos

I recently wrote that I was rekindling my interest in the Collatz problem. I still haven't had the time to read any literature but I came up with an interesting transformation to the problem that seems to unearth some structure which is maybe not directly obvious. For regular readers of this blog, you might see the similarities with this transformation and the formula I posted earlier on calculating the points for interval halving using a closed formula. It really was the Collatz problem that inspired that formula. To the uninitiated, the Collatz problem is based on the sequence
x_{n+1} = \begin{cases}
x_n/2 & x_n \text{ even}\\
3x_n + 1 & x_n \text{ odd}

The question is whether, for any positive integer starting value, the sequence will always reach the value 1.

One can reverse the problem in the following way. Define the set C in the following way

&1 \in C\\
&\text{if } x \in C \quad\text{ then }\quad2x \in C\\
&\text{if } x \in C \quad\text{ and }\quad x \text{ mod } 3 = 1 \quad\text{ then }\quad\frac{x-1}{3} \in C

The problem then transforms to the question, whether C contains all positive integers, in other words: is C=\mathbb{N}? In this form the problem can be represented as a tree with 1 at its root, every element x has at most two children. One child is always 2x, the other child is (x − 1) / 3 if that number is an integer.

Every odd number x trivially has an infinite number of descendants of the form 2kx. I will call the set of these numbers the tower of x.

Definition: The tower Tx of an odd number x is the set T_x=\{2^k x | k=0,1,2,\ldots\}. x is called the base of the tower.

We also define the children of T in the natural way, as a set of towers which are based on those children of the elements of T which are not elements of T themselves.

Definition: The children CT of a tower T is the set

C_T = \{T_y | y = (x-1)/3 \quad\text{ where }\quad x \in T\quad\text{ and }\quad x \text{ mod } 3 = 1 \}

It's easy to show that every tower either has no children, if the base is a multiple of 3, or an infinite number of children otherwise.

I would like to find a way to map each tower onto a single number, i.e. every element x of Tx should be mapped onto the same value. To do this I take the largest power kx of 2 that is smaller or equal to x: 2^{k_x} \le x < 2^{k_x+1}. We now define

t_x := 2^{-k_x} x - 1

The values of tx lie between 0 and 1 and are rational numbers of the form

t_x = \frac{p}{2^k}.

It is clear that k2x = kx + 1 and therefore t2x = tx. This means, all the elements of a tower are uniquely mapped onto a single value. We can, therefore, identify these values with the towers they originate from and we will refer to the towers T by their t_T := t_x, x \in T.

The next question that arises is, given an tower tT what do the children of a tower look like. As an example, the children of t7 = 3 / 4 are plotted in the figure on the right. The values of the first few of these children are given in the following table.

Children of t = 3/4
Fraction Decimal
1 / 8 0.125
5 / 32 0.15625
21 / 128 0.16406
85 / 512 0.16602
341 / 2048     0.16650

It can be seen, and easily proved, that if p / 2k is a child of any t, then so is (4p + 1) / 2k + 2. This results in an geometric series that converges to

(3p+1)/(3\times 2^k)

The figure above plots the children of the first 1000 towers against the values of the towers themselves. Interestingly a clear structure emerges. The sequences of the children all seem to converge against two straight line segments. These line segments can be represented by the function

y = \begin{cases}
\frac{1}{3}( 4x + 1) & \quad \text{for } x< \frac{1}{2}\\
\frac{1}{3}( 2x - 1) & \quad \text{for } x> \frac{1}{2}

When I first plotted this graph and discovered this very simple structure hidden in the Collatz problem, I was utterly amazed. Of course I don't know enough to decide if this structure could hold the key to unravelling the problem but it certainly gives me hope.

Finally, an apology to all those who know more about the problem. These things probably have been discovered already. I have not supplied any references and most likely got the terminology wrong. If so, I will be happy to be corrected.

Wednesday, 11 May 2011

The Three Wise Men Puzzle: Answer

This is the answer to the "Three Wise Men Puzzle" posted earlier.

To understand the reason the men stop laughing we follow the thoughts of one of the philosophers, let's call him A. A sees the other two men, B and C laughing. Assuming that he doesn't have a mark on his forehead, he thinks that B is laughing at C and vice versa. But he must also assume that B is unaware of his own mark on the forehead. If A had no mark, what does B think C is laughing about? In other words, if A's assumption was true, B should quickly realise that C is laughing at B and therefore B should stop laughing. But after enough time has passed, and B hasn't stopped, A must assume that he himself has a spot on his forehead.

Successive Interval Halving

In a previous post I talked about the trapezium rule for approximating integrals. One nice feature was that one can successively increase the resolution without having to evaluate the function at points at which it has already been evaluated once. The formula I gave was

If one uses this method from the onset, this will result at the function being evaluated at the following points

Step i    y = (xa) / (ba)
0 0
1 1
2 1/2
3 1/4
4 3/4
5 1/8
6 3/8
7 5/8
8 7/8

and so on. A few days ago I was creating a demonstration and I wanted to create a longer table with these values. I didn't want to create it by hand so I was looking for a closed formula that produces the values in the second column from those in the first column. After some pondering over the problem I came up with the following formula

y=(2n+1) 2^{-\lfloor\log_2(2n+1)\rfloor} - 1

Here the \lfloor.\rfloor denotes the floor function, i.e. the largest integer that is smaller or equal the argument. This formula will produce all the numbers in the table, including the zero, but not the one. The one, therefore, still has to be added by hand. Usually I would advise against using logarithms and exponentials in numerical algorithms but, because they are to the base 2, they can be implemented by bit shifting operations.

There is actually an interesting relation between 2n + 1 and y when we write them as binary numbers. In fact, if we write 2n + 1 as binary and place a decimal point after the leading 1 we get y + 1. Here is an example

2n+1    100111012
y+1 1.00111012
y 0.00111012

Tuesday, 10 May 2011

Wolfram's Target Audience

I am hooked on to Wolfram Alpha. I really think that it rocks and I can't live without it. Some days ago I wanted to bake a bread with my new bread baking machine. But I don't have any scales in the house to weigh the ingredients and the recipe tells me to use 250g of rye flour. But I do have a metric measuring cup. So I typed in density of rye flour and I get 540 g/dm^3 (grams per cubic decimeter) as answer. I know that a decimeter^3 is the same as a litre so a simple division tells me that I need 0.46l or 460ml of rye flour.

Yesterday I was using it for some more mathematical purpose when my eye fell on this ad for the Mathematica software. They use the slogan "One software to rule them all" a modification of a quote from "Lord of the Rings". Of course, my first reaction was "Wow, these guys are cool!" Then I calmed down a bit and started thinking about Wolfram's marketing strategy. My first thought was then replaced by the more sobering thought "Wow, these guys really know their target audience!" I am not really a fan of  marketing people because I think that targeted marketing aims at manipulating people. But funnily enough, in this case, I don't mind. Maybe, deep down, I think that the people who devised this ad had as much fun creating the ad as I have looking at it. But maybe that is exactly what they want me to believe. Maybe the marketing department of Wolfram are just a scheming bunch of cold blooded guys who know exactly what a geek like me would fall for. Maybe they hid the Easter eggs in Wolfram Alpha just to attract those nerds who have nothing better to do than to sit in front of the computer, trying to tease clever answers out of a machine.Well, they have succeeded. I think Mathematica is a great software I have had the pleasure to work with and I think, by making Wolfram Alpha and the Wolfram Integrator available to everyone, they have become more likeable. And yes, I will still waste my time by looking for Easter eggs that have not been posted on the internet yet.

Monday, 9 May 2011

Three Philosophers Puzzle (Three Wise Men Puzzle)

Here is a logic problem which I was told some years ago and I found it quite interesting at the time. I tried finding a reference to this problem on the internet but wasn't really successful.

EDIT: Thanks to one of the readers, who pointed out that the puzzle is actually well known and is called the "Three Wise Men Puzzle". A variation of the puzzle is the "Muddy Children Problem".

The problem goes like this:

The Three Philosophers  -  by Giorgione
Three philosophers are sitting in the shade of a tree discussing deep philosophical problems. They talk and talk and, of course, they don't reach an agreement or come any closer to a solution. After some hours of talking the afternoon sun makes the three men grow tired and they finally decide to take a nap. While the philosophers are sleeping a young boy from the local village passes by. He looks at the men and decides to play a prank on them. The boy takes some white paint and on each of their foreheads he paints a white spot that resembles bird droppings. After some time the philosophers wake up again and, looking at the others, they all burst out into laughter. Of course each of them thinks that a bird has relieved itself on their colleagues foreheads but none of them is aware that, he too, has a white spot on their forehead. In this way they keep on laughing, but they don't let the others know of their plight. But suddenly they all stop laughing, suddenly realising that they too must have the mark on their forehead.

Question: How, just by using logic, did they arrive at this conclusion?

The answer can be found here.

Friday, 6 May 2011

Tutorial: Trapezium Method

A subject I am particularly interested in is everything to do with numerical and mathematical algorithms. For this reason I decided that I am going to start a series on numerics. A second series will be on algorithms and discrete maths. The topics I am going to cover will not necessarily be published in any logical order as I will write about stuff that comes to my mind. But I will provide a page here which will act as a kind of index. In this way I hope that, after some time, I will be able to provide a tutorial on the subjects together with the odd little gem strewn in between.

Trapezium Method of Numerical Integration

When evaluating an integral using the trapezium rule the function f to be integrated is evaluated at discrete points xi between the integration boundaries a and b

xi = a + hi

where the interval size h is given by

h = \frac{b-a}{N},

N is the number of intervals and i=0\ldots N. I will also refer to N as the resolution. In practice N is often chosen to be a power of 2. There is no particular reason for this other than the fact that one commonly increases the resolution by a factor of 2 when improving the accuracy.

Let's abbreviate the notation by defining the discrete function values

f^N_i = f(x_i)

Here I used the upper index N to clarify that the discrete function values depend on the resolution. With this definition we can now write the trapezium approximation TN of resolution N to the integral as

\int_a^b f(x) dx \approx \frac{1}{2h}\left( f^N_0 + 2 \sum_{i=1}^{N-1} f^N_i + f^N_N \right) =: T_N

What this means geometrically is explained in the following graph. The integral of a function is the area below the graph of that function. The trapezium rule approximates this area by adding the areas of the trapezoids created by the x-axis and the function values at the discrete points. Each trapezoid has the area

\frac{1}{2h}\left( f_i + f_{i+1} \right)

Summing these areas up, one obtains the above formula. Let's do an example. We want to integrate the function sin(πx) between the limits 0 and 1. Using N=4 we get

T_4 = \frac{1}{8}\left( 0 +  1.41421 + 2.00000 + 1.41421 + 0 \right) =  0.60355

Comparing this with the exact solution

\int_0^1\sin(\pi x) dx = \frac{2}{\pi}

We see that, with this crude approximation, the error is about \varepsilon=0.03307

Of course one expects this approximation to get better as N increases and thus h decreases. In fact, one can show that, for sufficiently smooth functions

\int_a^b f(x) dx = T_N + O(h^2).

The above equation means that the error in the trapezium approximation decreases quadratically with the number of steps.

Let's see this on our example. In the diagram I plotted the error against the resolution on a double logarithmic plot. Up to N = 1e6 the curve is almost a straight line with a slope of -2. To show this the green line is the function \varepsilon \propto N^{-2}. This means that, in order reduce the error by a factor of four, one has to double the resolution.

One interesting feature of the trapezium rule is the fact that it makes the doubling of the resolution easy for us. When going from N to 2N, one does not have to evaluate the function at all the 2N points. Instead one can use the result TN of the previous approximation. The rule is

T_{2N} = \frac{1}{2} T_N + \frac{1}{2h} \sum_{i=1,3,\ldots}^{2N-1}

This formula makes it easy to successively increase the resolution until the required accuracy is achieved.

Finally I'd like to comment on the behaviour of the error when N gets really large. In the figure above you can see that the error levels off at about 2 × 10 − 7 when N increases above 2000. This error is above the numerical accuracy, which is about 10 − 8 in my calculations. When N increases further the error starts to increase again until, in the end it is even larger than before. The reason for this large error is that the individual terms in the sum become very small. Adding many small values up results in a loss of accuracy due to rounding errors.