When you try to solve an optimization or inverse problem, you most likely need to deal with the concept of the adjoint operator. Beyond the mathematical definition, I long lacked an intuitive understanding of what the “adjoint” really is.
In mathematics, the word “adjoint” has a number of somewhat related meanings, which makes it tricky to form a clear understanding. In linear algebra, the adjoint of a matrix refers to its transpose in the case of real entries, or to its conjugate transpose in the case of complex entries. This notion generalizes to linear operators between Hilbert spaces.
Let $A : X \to Y$ be a linear operator between Hilbert spaces $X$ and $Y$. The adjoint of $A$ is defined as the linear operator $A^* : Y \to X$ fulfilling $\langle Ax, y \rangle_Y = \langle x, A^* y \rangle_X$ for all $x \in X$ and $y \in Y$, where $\langle \cdot, \cdot \rangle_H$ is the inner product in the Hilbert space $H$.
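In finite dimensions with real entries, the adjoint is simply the transpose, and the defining identity can be checked numerically. Here is a minimal sketch (the random matrix and vectors are purely illustrative), assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# A maps from R^5 (space X) to R^3 (space Y)
A = rng.standard_normal((3, 5))
x = rng.standard_normal(5)
y = rng.standard_normal(3)

# Defining property of the adjoint: <A x, y>_Y == <x, A* y>_X.
# For a real matrix, A* is just the transpose A.T.
lhs = np.dot(A @ x, y)
rhs = np.dot(x, A.T @ y)
assert np.isclose(lhs, rhs)
```

This “dot-product test” is also the standard way to verify that a hand-coded adjoint implementation actually matches its forward operator.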
So what intuitive explanation can we draw from this?
An excellent article by Jon Claerbout conveys much of the intuition behind adjoint operators. In the following table, Claerbout lists some operators (left column) together with their adjoints (right column):
A spike given as input to any one of these forward operators experiences two possible effects: (1) a phase effect and (2) an amplitude effect. An essential property of adjoint operators is that they undo the phase effect but reapply the amplitude effect. Inverse operators, in contrast, try to undo both effects.
The adjoint of “truncation” is “zero padding”. Intuitively, take a length-7 real vector, say $x = (1, 2, 3, 4, 5, 6, 7)$. After removing the last two coefficients, you get the length-5 vector $y = (1, 2, 3, 4, 5)$. The adjoint operation needs to bring the vector length back from 5 to 7. The most straightforward way is to zero pad, which yields $\tilde{x} = (1, 2, 3, 4, 5, 0, 0)$. The resulting $\tilde{x}$ is not exactly equal to the original $x$, but it can be considered an approximation. For a closer approximation one could hope for the inverse, which is not obvious to derive here. In both cases, one can improve the answer by using some a-priori knowledge about the missing coefficients, for instance, that $x$ should be a sequence of increasing integers.
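The truncation/zero-padding pair is easy to code and to verify with the dot-product test. A minimal sketch (function names `truncate` and `zero_pad` are my own, not from the article), assuming NumPy:

```python
import numpy as np

def truncate(x, n):
    """Forward operator: keep the first n coefficients."""
    return x[:n]

def zero_pad(y, m):
    """Adjoint operator: pad with zeros back to length m."""
    return np.concatenate([y, np.zeros(m - len(y))])

x = np.array([1., 2., 3., 4., 5., 6., 7.])
y = truncate(x, 5)          # [1, 2, 3, 4, 5]
x_approx = zero_pad(y, 7)   # [1, 2, 3, 4, 5, 0, 0]

# Dot-product test: <T u, v> == <u, T* v> for arbitrary vectors.
u = np.random.default_rng(1).standard_normal(7)
v = np.random.default_rng(2).standard_normal(5)
assert np.isclose(np.dot(truncate(u, 5), v), np.dot(u, zero_pad(v, 7)))
```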
In many problems encountered in geophysics, inverse operators can be unstable, amplifying noise or generating artifacts. In these situations, using the adjoint to get a first approximation is a safer place to start. Iteration after iteration, the result can be constrained and refined until, ultimately, one gets very close to the correct solution.
For more intuition about the adjoint, take the following example: the adjoint of “stacking” is “spraying”. To illustrate this, imagine you have a length-3 vector, say $x = (1, 2, 3)$. After stacking it through a linear operator $S$, you get the sum of its coefficients, $1 + 2 + 3 = 6$. The adjoint operator $S^*$ needs to map from a scalar (the sum) back to a length-3 vector. A guess for the adjoint operation is to simply replicate, i.e. spray, the sum ($6$) into a length-3 vector and divide by the number of coefficients, in which case we get $(2, 2, 2)$, the average in each slot. Of course, this is far from the original $x$, which, as can be understood, is very hard to infer given only the sum of its coefficients.
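The stacking/spraying pair can be sketched the same way (function names `stack` and `spray` are my own, not from the article), assuming NumPy:

```python
import numpy as np

def stack(x):
    """Forward operator S: sum the coefficients into a scalar."""
    return np.sum(x)

def spray(s, n):
    """Adjoint operator S*: replicate the scalar into n slots."""
    return np.full(n, s)

x = np.array([1., 2., 3.])
s = stack(x)                 # 6.0

# The plain adjoint of summation is replication (spray):
x_adj = spray(s, 3)          # [6, 6, 6]

# Dividing by the length gives the averaged guess from the text:
x_guess = x_adj / 3          # [2, 2, 2] -- far from the original (1, 2, 3)

# Dot-product test for the pair: <S x, s> == <x, S* s>.
assert np.isclose(stack(x) * 4.0, np.dot(x, spray(4.0, 3)))
```

Note that the exact adjoint of summation is replication without any scaling; the division by 3 is an extra normalization that turns the guess into the average.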
Here again, the idea is the same: the adjoint is a simple first step toward the solution. Simply replicate to make a good guess of what a stacked input was before stacking, and simply zero pad to make a good guess of what a truncated input was before truncation. Several other examples of this kind are available in the table and can be analyzed in the same way.
In summary, my intuitive understanding of the adjoint is that it tries, iteration after iteration, to undo the effect(s) introduced by the forward operator. Unlike the inverse operator, it does not aim to undo those effects in one single step. In this sense, the adjoint is a better and safer place to start than the inverse. Faced with a problem in life involving fragile entities, we often take simple, small measures to progressively solve the issue at hand with the least possible damage. In doing so, we are, perhaps unconsciously, looking for and using adjoint operations.