mula; the problem is to detect the echo without knowledge of either the original object or the echo parameters. This is known as ‘blind echo cancellation’ in the signal processing literature and is known to be a hard problem in general.
We tried several methods to remove the echo. Frequency invariant filtering was not very successful. Instead we used a combination of cepstrum analysis and ‘brute force’ search.
The underlying idea of cepstrum analysis can be presented with a signal y(t) which contains a simple single echo, i.e. y(t) = x(t) + αx(t − τ). If Φxx denotes the power spectrum of x then Φyy(f) = Φxx(f)(1 + 2α cos(2πfτ) + α²), whose logarithm is approximately ln Φyy(f) ≈ ln Φxx(f) + 2α cos(2πfτ). Taking the power spectrum of this function of the frequency f brings out its ‘quefrency’ τ, that is the frequency of cos(2πτf) as a function of f. The auto-covariance of this latter function emphasises the peak that appears at ‘quefrency’ τ.
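For reference, the expression for Φyy given above can be derived in a few lines. The transfer function H(f) is notation introduced here for illustration, and the series expansion of the logarithm assumes |α| < 1; its first term is the 2α cos(2πfτ) ripple, and the higher terms sit at integer multiples of the quefrency τ.

```latex
\begin{align*}
Y(f) &= X(f)\,\bigl(1 + \alpha e^{-2\pi i f\tau}\bigr) = X(f)\,H(f),\\
\Phi_{yy}(f) &= \Phi_{xx}(f)\,|H(f)|^2
             = \Phi_{xx}(f)\,\bigl(1 + 2\alpha\cos(2\pi f\tau) + \alpha^2\bigr),\\
\ln\Phi_{yy}(f) &= \ln\Phi_{xx}(f)
             + 2\sum_{k\ge 1}\frac{(-1)^{k+1}\alpha^{k}}{k}\cos(2\pi k f\tau).
\end{align*}
```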
Fig. 7. When applied to images, the distortions introduced by StirMark are almost unnoticeable: ‘Lena’ before (a) and after (b) StirMark with default parameters. For comparison, the same distortions have been applied to a grid (c & d).
transforms; for instance O’Ruanaidh et al. suggest using the Fourier-Mellin transform.
However, the general lesson from this attack is that given a target marking scheme, one can invent a distortion (or a combination of distortions) that will prevent detection of the watermark while leaving the perceptual value of the previously watermarked object undiminished. We are not limited in this process to the distortions produced by common analogue equipment, or usually applied by end users with common image processing software. Moreover, the quality requirements of pirates are often lower than those of content owners who have to decide how much quality degradation to tolerate in return for extra protection offered by embedding a stronger signal. It is an open question whether there is any digital watermarking scheme for which a chosen distortion attack cannot be found.
B.2 Attack on echo hiding
As mentioned above, echo hiding encodes zeros and ones by adding to a cover audio signal echo signals distinguished by two different values for their delay τ and their relative amplitude α. The delays are chosen between 0.5 and 2 ms, and the relative amplitude is around 0.8. According to its creators, decoding involves detecting the initial delay, and the auto-correlation of the cepstrum of the encoded signal is used for this purpose. However, the same technique can be used for an attack.
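To make the scheme concrete, here is a minimal sketch of echo embedding in Python with NumPy. It is a simplification under stated assumptions, not the creators' implementation: the function name `embed_echo_bits`, the default delays `tau0` and `tau1`, the equal-length block layout and the use of a single amplitude `alpha` for both bit values are all hypothetical choices made for illustration.

```python
import numpy as np

def embed_echo_bits(x, bits, fs, tau0=0.0005, tau1=0.001, alpha=0.8):
    """Hide one bit per block by adding an echo whose delay depends on the bit.

    tau0, tau1 (seconds) and the equal-length block layout are illustrative
    assumptions; the text only says delays lie between 0.5 and 2 ms.
    """
    x = np.asarray(x, dtype=float)
    y = x.copy()
    block = len(x) // len(bits)
    for i, bit in enumerate(bits):
        d = int(round((tau1 if bit else tau0) * fs))  # delay in samples
        start, stop = i * block, (i + 1) * block
        idx = np.arange(max(start, d), stop)          # skip samples with no past
        y[idx] += alpha * x[idx - d]                  # y[n] = x[n] + alpha * x[n - d]
    return y
```

Abrupt switching between delays at block boundaries is used here only to keep the sketch short.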
The ‘obvious’ attack on this scheme is to detect the echo and then remove it by simply inverting the convolution for-
We need a method to detect the echo delay τ in a signal. For this, we used a slightly modified version of the cepstrum: C ◦ Φ ◦ ln ◦ Φ, where C is the auto-covariance function (C(x) = E((x − x̄)(x − x̄)*)), Φ the power spectral density function and ◦ the composition operator. Experiments on random signals as well as on music show that this method returns quite accurate estimates of the delay when an artificial echo has been added to the signal. In the detection function we only consider echo delays between 0.5 and 3 ms (below 0.5 ms the function does not work properly and above 3 ms the echo becomes too audible).
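The composition C ◦ Φ ◦ ln ◦ Φ can be sketched as follows. This is one possible reading under assumptions, not the authors' code: Φ is estimated with a plain periodogram, C with a biased sample auto-covariance, and the names `autocovariance` and `detect_echo_delay`, the small constant added before the logarithm, and the removal of the mean log-spectrum are choices made here for illustration.

```python
import numpy as np

def autocovariance(x):
    """Biased sample auto-covariance of x for lags 0 .. len(x) - 1, via FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    spec = np.fft.rfft(x, 2 * n)                 # zero-pad to avoid circular wrap
    return np.fft.irfft(np.abs(spec) ** 2)[:n] / n

def detect_echo_delay(y, fs, tau_min=0.0005, tau_max=0.003):
    """Estimate the echo delay (seconds) as the strongest lag of C(Phi(ln(Phi(y))))."""
    y = np.asarray(y, dtype=float)
    log_spec = np.log(np.abs(np.fft.fft(y)) ** 2 + 1e-12)   # ln o Phi
    log_spec -= log_spec.mean()                              # drop the constant term
    cepstrum = np.abs(np.fft.rfft(log_spec)) ** 2            # Phi o ln o Phi
    score = autocovariance(cepstrum)                         # C o Phi o ln o Phi
    lo = int(np.ceil(tau_min * fs))                          # search window in samples
    hi = int(np.floor(tau_max * fs))
    lag = lo + int(np.argmax(score[lo:hi + 1]))
    return lag / fs, score
```

Because the two-sided spectrum of a length-N signal is used, index q of the cepstrum corresponds to a delay of q samples, so the returned lag converts to seconds by dividing by `fs`. The search window mirrors the 0.5 to 3 ms range stated in the text.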
Our first attack was to remove an echo with random relative amplitude, expecting that this would introduce enough modification in the signal to prevent watermark recovery. Since echo hiding gives best results for α greater than 0.7 we could use α̂ (an estimator of α) drawn from, say, a normal distribution centred on 0.8. It was not really successful so our next attack was to iterate: we re-applied the detection function and varied α̂ to minimise the residual echo. We could obtain successively better estimates of the echo parameters and then remove this echo. When the detection function cannot detect any more echo, we have found the correct value of α̂ (as this gives the lowest output value of the detection function).
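The iteration over α̂ can be sketched as a brute-force search. The sketch below makes several assumptions that are not in the text: the delay is taken as already estimated and given in samples, the residual echo is measured by the magnitude of the real cepstrum at that delay (a simpler stand-in for the detection function), and the names `remove_echo`, `residual_echo` and `search_alpha` as well as the candidate grid are hypothetical.

```python
import numpy as np

def remove_echo(y, delay, alpha):
    """Invert y[n] = x[n] + alpha * x[n - delay] by x[n] = y[n] - alpha * x[n - delay]."""
    x = np.array(y, dtype=float)
    # Process blocks of `delay` samples in order, so that each block is
    # corrected using the already-corrected block before it.
    for start in range(delay, len(x), delay):
        stop = min(start + delay, len(x))
        x[start:stop] -= alpha * x[start - delay:stop - delay]
    return x

def residual_echo(x, delay):
    """Magnitude of the real cepstrum of x at the given delay (in samples)."""
    log_spec = np.log(np.abs(np.fft.rfft(x)) ** 2 + 1e-12)
    cepstrum = np.fft.irfft(log_spec, len(x))
    return abs(cepstrum[delay])

def search_alpha(y, delay, candidates=np.arange(0.50, 1.00, 0.05)):
    """Try each candidate amplitude and keep the one leaving the least echo."""
    scores = [residual_echo(remove_echo(y, delay, a), delay) for a in candidates]
    best = int(np.argmin(scores))
    return float(candidates[best]), remove_echo(y, delay, candidates[best])
```

Candidates are kept below 1 so that the recursive inverse filter stays stable. A finer grid, or a second pass around the best candidate, would correspond to the successively better estimates described in the text.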
B.3 Other generic attacks
Some generic attacks attempt to estimate the watermark and then remove it. Langelaar et al., for instance, present an attack on white spread spectrum watermarks. They try different methods to model the original image and apply this model to the watermarked image Ĩ = I + W to separate it into two components: an estimated image Î and an estimated watermark Ŵ such that the watermark W does not appear anymore in Î, giving ρ(Î, W) ≈ 0. The authors show that a 3×3 median filter gives the best results. However an amplified version of the estimated watermark needs to be subtracted because the low frequency components of the watermark cannot be estimated accurately, leading to a positive contribution of the low frequencies and a negative contribution of the high frequencies to the cor-