After delving into the curious nature of spacetime and the effects of relativity on our understanding of time, today I decided to write a brief article on a more mundane topic - the Fourier transform.

The reason for this decision actually arose from an incident in the hospital. At one point I happened to be chatting to a member of the staff about the reconstruction algorithms for computed tomography imaging system (that is the physicists' equivalent of small talk) and we got on the topic of Fourier transforms. They are critical for reconstructing images from most forms of medical imaging, and yet they are seldom explained very well - and especially not to college students just trying to get a job in the field. 

However even among graduate students and tenured faculty, the Fourier transform is often viewed as this mathematical method and set of equations, with no true understanding of how or why it works. That is unfortunate, as the Fourier transform is a truly beautiful thing.

And so for that reason, today I will be presenting an intuitive explanation of the Fourier transform - and I promise to leave out the scary mathematics. Following that, I will (eventually) write another article on why the Fourier transform is so important for radiology and CT scanning. 

Let's begin by considering how digital photos are recorded and stored. For simplicity I will consider only grayscale images, such as the one shown below, but colour images work in a very similar way. 


A digital image is actually a two dimensional array of numbers. For most applications, this can be as small as one million individual numbers, or as large as twenty or thirty million numbers. The range of the numbers varies depending on the type of image, but whole numbers between 0 and 255 tend to be very common, as do numbers between 0 and 65, 535. For our purposes it doesn't really matter, and so we won't mention it again. 

The problem with storing images in this way is that the individual numbers give us very little information on the image as a whole. The fact that the top right corner has grayscale level 57 or that the bottom left is grayscale 25 tells us absolutely nothing about the nature of the image. We really cannot study the image unless we have all of the numbers, and even then there is very little useful information to be gained aside from our visual perceptions of the entire image.

However there is another way of storing an image, and it is much more useful for analyzing and collecting information. 

So what is the first piece of information that we need. We need to know the overall brightness of the image, so the first number we will write down is the average of all grayscale numbers in the image. For reasons that will become clear later, we will call this number F(0,0).


Next, we want to know if the image is brighter on the left or on the right. And so we will average all of the grayscale numbers on the right of the image and subtract the average of all of the grayscale numbers on the left side of the image. We will call this number F(1,0), and will define F(0,1) as the difference between the averages of the top and the bottom of the image.

                           

Next we consider whether the image is bright near the center or near the two outer edges. We define the number F(2,0) as the difference between the average of the middle half, and subtract the average of the two outer quarters. We define F(0,2) the same way from top to bottom.

                           

This process can be continued for higher numbers of divisions in both directions, such as F(2,2) shown below,



The final numbers having hundreds or thousands of divisions in each direction, depending on the resolution of the original image. 

Now we can combine all of these numbers to form a new image,



Although this image doesn't look very interesting, it actually contains significantly more information than the original did. The bright spot in the center is F(0,0), and its brightness means that the original image was relatively bright and with low contrast. The dark edges of this image tell us that the original image did not have a lot of fine details, but since they are not black there is some detail. The bright vertical line tells us that the image is not very symmetrical from top to bottom, but the lack of a horizontal line suggests some symmetry from left to right. 

And in fact, the information contained in this new image is sufficient to reproduce the original image. Every detail of our photograph is stored in this transformed image somewhere.

Unfortunately this is not the Fourier transform. But it is very close.

In the transform given above, there are sharp edges between the different regions. A pixel in the original image is either entirely on the left or entirely on the right when calculation F(1,0). In practice though we want to have pixels on the edge of the image be more left or more right than pixels near the center line. I promised not to give any mathematical formulae in this article, and so I won't mention that in the Fourier transform we replace this sharp divide with sine and cosine functions. Instead of average everything on the right and everything on the left equally, we will assign a weighting of one the pixels on the far right and a weighting of zero to pixels along the center line, and a weighting of negative one to the pixels on the far left (since they are being subtracted). 

In all of the other numbers that are given above, we will give pixels on the edge of each region a weighting of zero, and pixels in the center of each region a weighting of one. The pixels in between will be weighted using trigonometric functions, but the details aren't really important.

It is this transformation that we call the Fourier transform, and it has uses in every branch of physics and mathematics ranging from subatomic particle physics to medical imaging machines and up to studies of the entire Universe. But that is a topic for another day...