# From Fourier to Curvelets

Curvelets have emerged as an important tool in signal analysis. Like other transforms, the curvelet transform represents a signal in a different domain which *eases* analysis. In particular, curvelets offer several desirable features in the decomposition space. When I first read about curvelets, they didn't make much intuitive sense to me. Below is an attempt to expose some of their intuitive aspects. To set the stage, I will transition from the Fourier transform to wavelets, and then explain some of the motivations behind curvelets.

**Fourier and the uncertainty principle**

In the 1820s, Joseph Fourier had the remarkable insight that any signal can be represented as a sum of sines and cosines. The Fourier transform was named after him, and is one of the most used and celebrated tools in signal analysis. In short, the Fourier transform decomposes a signal, measured in the *time domain* for instance, into a sum of sine and cosine waves in the *frequency domain*. Geoscientists find this useful in analyzing signal versus noise in different frequency bands, among several other applications.

An important factor behind the wide popularity of the Fourier transform is the Fast Fourier Transform (FFT) algorithm. The most used version of the FFT algorithm was developed by Cooley and Tukey in 1965. This is one of the most important algorithms in signal processing and data analysis. It has a complexity of O(N log N) for a number of measurements N, instead of the O(N^2) complexity of the “naive” version of the discrete Fourier transform.
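To make the complexity claim concrete, here is a minimal sketch (assuming NumPy is available) comparing a naive O(N^2) DFT, written as one big matrix-vector product, against NumPy's Cooley-Tukey-based FFT. Both produce the same coefficients; the FFT just gets there in O(N log N) operations:

```python
import numpy as np

def naive_dft(x):
    """O(N^2) discrete Fourier transform: a full matrix-vector product."""
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # N x N DFT matrix
    return W @ x

rng = np.random.default_rng(0)
x = rng.standard_normal(512)

X_naive = naive_dft(x)   # N^2 = 262,144 complex multiplies
X_fft = np.fft.fft(x)    # O(N log N) via Cooley-Tukey

print(np.allclose(X_naive, X_fft))  # same coefficients, far fewer operations
```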

Despite its universally recognized importance as a primary tool for analyzing signals, the Fourier transform has a fundamental limitation related to Heisenberg’s uncertainty principle. This principle in physics states that you can know the speed at which a particle is traveling or its position, but you can’t effectively measure both at the same time. There is a tradeoff: if you want to become more certain about the speed of the particle, you’ll become less certain about its position, and vice versa. The level of certainty can also be thought of as the *resolution* between the two measurements. So how does this relate to the Fourier transform?

The Fourier transform has the same uncertainty limitation: *you can either know the frequency of a signal or its time, but you can’t know both*. By frequency, I mean instantaneous frequency. By time, I mean arrival time. If you reduce the analysis window in time, you become less certain about the Fourier representation in frequency, so the spectral coefficients are no longer reliable. In the same way, a reliable estimate of the frequencies contained in the signal requires a large analysis window in time, which is not suitable for time-localized signals. In short, the Fourier transform lacks resolution between the time and frequency domains.
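The tradeoff can be shown numerically. In this sketch (assuming NumPy; the tone frequencies and window lengths are illustrative choices of mine), two tones only 5 Hz apart are cleanly separated by a long analysis window, while a short window, whose frequency-bin spacing is coarser than the tone separation, cannot resolve them:

```python
import numpy as np

fs = 1000            # sampling rate, Hz
f1, f2 = 50.0, 55.0  # two tones only 5 Hz apart

def spectrum(duration):
    """Magnitude spectrum of the two-tone signal over a window of `duration` seconds."""
    t = np.arange(0, duration, 1 / fs)
    x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return freqs, np.abs(np.fft.rfft(x))

# Long window: 2 s -> 0.5 Hz bin spacing; the tones appear as two separate peaks.
freqs_long, mag_long = spectrum(2.0)
dip = mag_long[np.argmin(np.abs(freqs_long - 52.5))]   # valley between the tones
peak = mag_long[np.argmin(np.abs(freqs_long - 50.0))]  # one of the two peaks
print(dip < 0.1 * peak)  # clear separation between the peaks

# Short window: 0.1 s -> 10 Hz bin spacing, coarser than the 5 Hz separation,
# so the tones blur together: good time localization, poor frequency resolution.
freqs_short, _ = spectrum(0.1)
print(round(freqs_short[1] - freqs_short[0], 6))  # bin spacing in Hz
```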

**Fourier and the Gibbs phenomenon**

The Fourier transform is excellent at representing smooth functions in time with only a small number of Fourier coefficients. It can be shown that the *smoother* the function gets, the *faster* the decay in its Fourier coefficients. Therefore, the Fourier transform provides a sparse representation for smooth functions.

However, the Fourier transform has an inherent difficulty in representing *discontinuities* in signals. These discontinuities can be in the form of sudden jumps from low to high values. The difficulty lies in the fact that the Fourier transform can’t represent such discontinuities with a small number of coefficients. In practice, it may be impossible to use all the terms in the Fourier series. Sometimes, using all terms can lead to undesirable processing artifacts, which is one of the main reasons why we seek sparsity. For signals with discontinuities, such as the square wave represented in red below, even a large number of coefficients in the Fourier domain will not be sufficient to reproduce the square wave *exactly*. This effect is known as the Gibbs phenomenon, and it manifests itself in the form of ripples of increasing frequency closer to the transitions of the square signal. Discontinuities in the signal lead to *slow decay* in the Fourier coefficients, as can be noticed in the plot.
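A short sketch (assuming NumPy) makes the Gibbs effect visible: summing the odd-harmonic Fourier series of a square wave of amplitude ±1, the partial sums overshoot the jump by roughly 9% of the jump size, and adding more terms only narrows the ripple without shrinking it:

```python
import numpy as np

def square_partial_sum(t, n_terms):
    """Partial Fourier series of a +/-1 square wave: (4/pi) * sum over odd k of sin(k t)/k."""
    s = np.zeros_like(t)
    for k in range(1, n_terms + 1, 2):  # odd harmonics only
        s += np.sin(k * t) / k
    return 4.0 / np.pi * s

# Dense grid just to the right of the jump at t = 0, where the overshoot lives.
t = np.linspace(1e-4, 0.5, 100_000)

for n in (99, 999):
    overshoot = square_partial_sum(t, n).max()
    print(n, round(overshoot, 3))  # ~1.18 in both cases: the overshoot does not shrink
```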

There is an important result quantifying the approximation rate of the Fourier transform based on the M-term approximation error. With M being the number of coefficients in the best M-term Fourier representation f_M of a signal f (by “best” I mean minimizing the least-squares error between f and f_M), the reconstruction error for such a discontinuous signal is proportional to M^(−1/2).

The wavelet transform, explained below, comes to ameliorate the difficulty associated with the Gibbs effect, and to address the uncertainty problem between time and frequency.

**Wavelets: where time meets frequency**

Basically, wavelets are just “mini-waves”.

Rather than being a wave that goes on forever, like the sines and cosines in the Fourier transform, a wavelet is a short “burst” of waves that dies off rapidly. Wavelets are useful because they are limited in time and frequency. As a consequence, only a small number of wavelet coefficients may be needed to represent a discontinuity in the signal. It is important to note that by “discontinuity” in this case we mean a *point* discontinuity in 1D or 2D, which is different from discontinuities along *curves*, such as the edges in images.

The time-limiting property of wavelets is important as it provides more resolution in the time domain. Instead of modeling a signal with an infinite wave (which is what Fourier does), wavelets model it with a finite wave that you can slide along the time domain by a *shift* factor (also known as a translation factor). To capture both the low and high frequencies in a signal, wavelets can be stretched and squeezed depending on a *scale* factor.

With the wavelet transform, the signal is decomposed into the sum of mini-waves that have different scales and need to be translated by different time shifts to reconstruct the signal. For a one-dimensional signal in the time domain, its wavelet transform has two dimensions (shift and scale), instead of one dimension (frequency), which is what’s offered by the Fourier transform. The extended domain provided by the wavelet transform (two dimensions instead of one) explains its superior resolution between time (through the shift factor) and frequency (through the scale factor).
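As an illustration of the shift and scale factors (a sketch of mine, using the Ricker or “Mexican hat” function as the mother wavelet, with NumPy assumed), each wavelet atom is the mother wavelet slid along time by a shift b and stretched by a scale a, normalized by 1/sqrt(a) so that every atom carries the same energy:

```python
import numpy as np

def ricker(t):
    """Ricker ("Mexican hat") wavelet: a short burst that dies off rapidly."""
    return (1 - 2 * t**2) * np.exp(-t**2)

def wavelet_atom(t, scale, shift):
    """Mother wavelet stretched by `scale` and slid along time by `shift`.
    The 1/sqrt(scale) factor keeps the energy of every atom the same."""
    return ricker((t - shift) / scale) / np.sqrt(scale)

t = np.linspace(-20, 20, 40_001)
dt = t[1] - t[0]

base = wavelet_atom(t, scale=1.0, shift=0.0)
stretched = wavelet_atom(t, scale=3.0, shift=5.0)  # lower frequency, centered later in time

# Energy (squared L2 norm) is preserved across scales and shifts.
e_base = np.sum(base**2) * dt
e_stretched = np.sum(stretched**2) * dt
print(abs(e_base - e_stretched) < 1e-6 * e_base)
```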

The M-term approximation error in the wavelet transform is proportional to M^(−1), which is better than the M^(−1/2) rate of the Fourier transform. This means that wavelets ameliorate the difficulty associated with discontinuities in the signal, since a smaller number of coefficients is required to accurately reconstruct the signal.
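A quick numerical sketch shows the gap (assuming NumPy; the Haar wavelet and the step signal are illustrative choices of mine). Keeping the same number of largest-magnitude coefficients, a Haar wavelet representation of a step leaves far less residual energy than a Fourier representation of the same step:

```python
import numpy as np

def haar_fwd(x):
    """Orthonormal Haar wavelet transform (signal length must be a power of two)."""
    coeffs = []
    a = x.astype(float)
    while len(a) > 1:
        avg = (a[0::2] + a[1::2]) / np.sqrt(2)
        det = (a[0::2] - a[1::2]) / np.sqrt(2)
        coeffs.append(det)  # detail coefficients at this scale
        a = avg
    coeffs.append(a)        # final approximation coefficient
    return np.concatenate(coeffs[::-1])

N, M = 1024, 16                    # signal length, number of coefficients kept
x = np.zeros(N)
x[N // 2:] = 1.0                   # a step: a single point discontinuity

def m_term_error(coeffs, M):
    """Energy left over after keeping only the M largest-magnitude coefficients."""
    mags = np.sort(np.abs(coeffs))[::-1]
    return np.sum(mags[M:] ** 2)

err_fourier = m_term_error(np.fft.fft(x) / np.sqrt(N), M)  # unitary scaling (Parseval)
err_haar = m_term_error(haar_fwd(x), M)
print(err_haar < err_fourier)  # wavelets need far fewer coefficients for the step
```

With a step aligned this way, only a couple of Haar coefficients are nonzero at all, while the Fourier coefficients decay slowly and 16 of them still leave visible ripple energy behind.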

The shortcoming of the wavelet transform lies in the fact that, in the same way that many Fourier coefficients are needed to represent a (point) discontinuity in the signal, many wavelets are needed to properly reconstruct the *edges* in an image. In this case, the wavelet representation is not sparse. In a 2D sense, wavelets are isotropic: they can only be horizontal, vertical or diagonal, so they cannot adapt to the *curved* geometrical structures of high-contrast edges in an image. This shortcoming is in fact intrinsic to both the wavelet and Fourier transforms: neither wavelets nor sinusoids sparsify two-dimensional objects with edges (although wavelets are better than sinusoids).

The concept of wavelets opened my mind to what a transform really is. We are used to the Fourier transform, and many consider it more intuitive than the wavelet transform. However, the notions of “scale” and “shift” are easy to comprehend; compared to “frequency”, they are at least as intuitive.

The failure of wavelets at representing edges motivated the development of an important tool: the curvelet transform.

**Curvelets: toward an optimal signal representation**

Curvelets were first introduced by Candes and Donoho in 1999. Their development was motivated by the difficulty of representing curved edges in images. The approximation rate for curvelets can be shown to be proportional to M^(−2) (log M)^3 as M increases, which is close to the optimal rate M^(−2). Curvelets are important in that they perform for edges in images the task that wavelets perform for (point) discontinuities in signals.

The main criticism of curvelets has been that they are not intuitive enough. A second-generation curvelet transform was proposed that is more intuitive and considerably simpler, since it is based on a partitioning (also known as tiling) of the Fourier space. Different tiles in that space select smooth portions of the data that map as “needles” in the curvelet domain. The tiling is optimal and obeys a parabolic relationship between length and width. The fact that this partitioning works so well in representing curved edges is both surprising and stimulating: surprising since it works without ever depending on the pre-selection of any “map” of the edges, and stimulating since this deterministic partitioning seems to work very well in capturing curved edges with no need for any adaptivity to the geometry of these curves.

In addition, the curvelet transform is based on an *anisotropic* scaling principle which is quite different from the isotropic scaling of wavelets (where direction in a multi-dimensional space can only be horizontal, vertical or diagonal). The elements in the curvelet representation obey a special scaling law, where the length and the width of the support are linked by the relation width ≈ length^2. On top of that, curvelets feature *directivity*, which is important in representing angles.
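Written out, the parabolic scaling law from the curvelet literature reads as follows (a sketch in standard notation, with j the scale index):

```latex
% Parabolic scaling: at scale 2^{-j}, a curvelet's support is an elongated
% "needle" whose width shrinks twice as fast (in the exponent) as its length.
\[
  \text{length} \approx 2^{-j/2}, \qquad
  \text{width} \approx 2^{-j}
  \quad\Longrightarrow\quad
  \text{width} \approx \text{length}^{2}.
\]
```

Isotropic wavelet scaling, by contrast, shrinks length and width at the same rate, which is why wavelet atoms stay roughly square and cannot hug a curved edge.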

So curvelets have some merits compared to the Fourier transform. However, these merits come at a cost: the curvelet transform usually takes longer to compute. There has so far been no equivalent of the Fast Fourier Transform algorithm for curvelets. In some cases, the curvelet transform can also be challenging to parameterize. Despite these shortcomings, curvelets represent an important addition to the arsenal of signal-analysis tools and promise to be the Fourier transform of the future.

This is the best document I could ever read about the curvelet transform.

Navid, glad to hear you found this article useful.

Could you please state the difference between the following:

1. Curvelet Statistical Features (CSFs) and Curvelet Co-occurrence Features (CCFs).

2. Curvelets via USFFT (Unequally Spaced Fast Fourier Transform) and Curvelets via Wrapping.

3. the 1st gen CT and 2nd gen CT.

4. the algorithm that uses (sub-band decomposition, smooth partitioning, renormalization, ridgelet analysis) and the one that (applies the 2D FFT to the image, multiplies the 2D FFT of the image by a discrete localizing window, performs wrapping, and takes the inverse 2D FFT at each scale and orientation).

And an intuitive explanation of the wrapping implementation?

Youness, sorry about the delay in responding. Since I’m not an expert in curvelets, I won’t attempt to answer your questions. However, there is a good book by Stephane Mallat that might contain some of the answers:

http://links.uwaterloo.ca/amath391w13docs/Mallat3.pdf