have: H(E) ≤ n − H(C) bit/pixel, so all the gain provided by compression is used for hiding. One could also take into account the stego-text S and impose the constraint that no information is given about E, even knowing S and C_k (a part of C, typically the natural noise of the cover-text): the transinformation should be zero, T(E; (C_k, S)) = 0. In this case, it can be shown that H(E) ≤ H(C_k | S) [130]. So the rate at which one can embed ciphertext in a cover-object is bounded by the opponent’s uncertainty about the cover-text given knowledge of the stego-text. But this gives an upper bound on the stego-capacity of a channel, when for a provably secure system we need a lower bound. In fact all the theoretical bounds known to us are of this kind. In addition, the opponent’s uncertainty, and thus the capacity, might asymptotically be zero, as was noted in the context of covert channels.
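The two bounds above can be illustrated numerically. The following is a minimal sketch, assuming empirical entropies estimated from small, hypothetical samples; the function names and the toy data are ours and are not taken from the cited work.

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy of a list of observations, in bits."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def conditional_entropy(pairs):
    """Empirical H(X | Y) in bits from (x, y) observations: H(X, Y) - H(Y)."""
    return entropy(pairs) - entropy([y for _, y in pairs])

# Hypothetical cover of 2-bit pixels (n = 2) with a skewed distribution.
n = 2
cover = [0, 0, 0, 0, 1, 1, 2, 3]
bound_from_compression = n - entropy(cover)   # upper bound on H(E), bit/pixel

# Hypothetical (C_k, S) observations: in the first case S says nothing
# about C_k, in the second case S reveals C_k completely.
independent = [(0, 0), (1, 0), (0, 1), (1, 1)]
revealed = [(0, 0), (1, 1)]
bound_independent = conditional_entropy(independent)
bound_revealed = conditional_entropy(revealed)
```

In the second case the opponent's uncertainty about C_k given S is zero, so the bound leaves no room for embedding at all, which is the degenerate situation mentioned at the end of the paragraph.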
This also highlights the fact that steganography is much more dependent on our understanding of the information sources involved than cryptography is, which helps explain why we do not have any lower bounds on capacity for embedding data in general sources. It is also worth noting that if we had a source which we understood completely and so could compress perfectly, then we could simply subject the embedded data to our decompression algorithm and send it as the stego-text directly. Thus steganography would either be trivial or impossible depending on the system.
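The remark about a perfectly understood source can be made concrete with a toy example. The sketch below assumes a hypothetical three-symbol source with probabilities 1/2, 1/4 and 1/4, for which a simple prefix code is a perfect compressor; the code table and function names are illustrative assumptions only.

```python
# Hypothetical toy source: symbols a, b, c with probabilities 1/2, 1/4, 1/4.
# For such a source this prefix code compresses perfectly.
CODE = {"a": "0", "b": "10", "c": "11"}
DECODE = {bits: sym for sym, bits in CODE.items()}

def decompress(bits):
    """Embedding: parse (encrypted, random-looking) bits as source symbols."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    if buf:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

def compress(symbols):
    """Extraction: the ordinary compressor recovers the embedded bits."""
    return "".join(CODE[s] for s in symbols)

embedded_bits = "0110100"
stego_text = decompress(embedded_bits)
recovered_bits = compress(stego_text)
```

Because uniformly random input bits are decoded into symbols with exactly the source probabilities, the stego-text in this toy setting is distributed like genuine source output; real sources are of course never understood this well.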
Another way of getting round this problem is to take advantage of the natural noise of the cover-text. Where this can be identified, it can be replaced by the embedded data (which we can assume has been encrypted and is thus indistinguishable from random noise). This is the philosophy behind some steganographic systems and early image marking systems (it may not work if the image is computer generated and thus has very smooth colour gradations). It can also be applied to audio; here, randomising is very important because simple replacement of the least significant bit causes an audible modification of the signal. So a subset of modifiable bits is chosen and the embedding density depends on the observed statistics of the cover-signal or on its psychoacoustic properties.
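A minimal sketch of this idea follows, assuming integer samples (for example 8-bit pixels or audio samples) and a subset of positions chosen by a generator seeded with a shared key. The function names are hypothetical, Python's `random.Random` is not cryptographically secure and merely stands in for a keyed generator, and the fixed embedding density ignores the statistical and psychoacoustic adaptation mentioned above.

```python
import random

def select_positions(key, n_samples, n_bits):
    """Pick n_bits distinct sample positions, determined by the shared key."""
    rng = random.Random(key)  # stand-in for a keyed, secure generator
    return rng.sample(range(n_samples), n_bits)

def embed(samples, bits, key):
    """Replace the least significant bit of the selected samples."""
    out = list(samples)
    for pos, bit in zip(select_positions(key, len(out), len(bits)), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out

def extract(stego, n_bits, key):
    """Read the least significant bits back in the same keyed order."""
    return [stego[pos] & 1 for pos in select_positions(key, len(stego), n_bits)]

# Hypothetical cover samples and (already encrypted) payload bits.
cover = list(range(100, 120))
payload = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, payload, key=42)
recovered = extract(stego, len(payload), key=42)
```

Each sample changes by at most one quantisation step, and only the keyed subset of positions is touched, so the payload occupies less than half of the available least significant bits in this example.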
It is also possible to exploit noise elsewhere in the system. For example, one might add small errors by tweaking some bits at the physical or data link layer and hope that error correction mechanisms would prevent anyone reading the message from noticing anything. This approach would usually fall foul of Kerckhoffs’ principle that the mechanism is known to the opponent, but in some applications it can be effective.
A more interesting way of embedding information is to change the parameters of the source encoding. An example is given by a marking technique proposed for DVD. The encoder of the MPEG stream has many choices of how the image can be encoded, based on the trade-off between good compression and good quality; each choice conveys one or more bits. Such schemes trade expensive marking techniques for inexpensive mark detection; they may be an alternative to signature marks in digital TV where the cost of the consumer equipment is all-important.
Finally, in case the reader should think that there is anything new under the sun, consider two interpretations of a Beethoven symphony, one by Karajan and the other by Bernstein. These are very similar, but also dramatically different. They might even be considered to be different encodings, and musicologists hope eventually to discriminate between them automatically.
C. Robust marking systems
In the absence of a useful theory of information hiding, we can ask the practical question of what makes a marking scheme robust. This is in some ways a simpler problem (everyone might know that a video is watermarked, but so long as the mark is unobtrusive this may not matter) and in other ways a harder one (the warden is guaranteed to be active, as the pirate will try to erase marks).
As a working definition, we mean by a robust marking system one with the following properties:
• Marks should not degrade the perceived quality of the work. This immediately implies the need for a good quality metric. In the context of images, pixel based metrics are not satisfactory, and better measures based on perceptual models can be used;
• Detecting the presence and/or value of a mark should require knowledge of a secret;
• If multiple marks are inserted in a single object, then they should not interfere with each other; moreover if different copies of an object are distributed with different marks, then different users should not be able to process their copies in order to generate a new copy that identifies none of them;
• The mark should survive all attacks that do not degrade the work’s perceived quality, including resampling, re-quantisation, dithering, compression and especially combinations of these.
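The second property, that detection should require knowledge of a secret, can be sketched with a simple additive scheme. This is only an illustration under our own assumptions: a key-seeded sequence of +1/-1 values is added to the signal and later detected by correlation. The scheme, the function names, the `strength` parameter and the threshold are hypothetical and are not prescribed by the properties above, and `random.Random` is not a secure generator.

```python
import random

def pattern(key, length):
    """Key-dependent pseudo-random sequence of +1/-1 values."""
    rng = random.Random(key)  # stand-in for a keyed, secure generator
    return [rng.choice((-1, 1)) for _ in range(length)]

def add_mark(signal, key, strength):
    """Add the keyed pattern, scaled by strength, to the signal."""
    marks = pattern(key, len(signal))
    return [x + strength * m for x, m in zip(signal, marks)]

def correlate(signal, key):
    """Average product of the signal with the keyed pattern."""
    marks = pattern(key, len(signal))
    return sum(x * m for x, m in zip(signal, marks)) / len(signal)

def detect(signal, key, strength):
    """Declare the mark present if the correlation exceeds half the strength."""
    return correlate(signal, key) > strength / 2
```

With the right key the correlation concentrates around `strength`, while with a wrong key, or on an unmarked signal, it stays near zero for long signals. The sketch says nothing about the remaining properties: it makes no attempt to model perceptual quality, collusion between users, or the distortions listed in the last item.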
Requirements similar to these are found, for example, in a recent call for proposals from the music industry. However, as we have shown with our attacks, there are at present few marking schemes, whether in the research literature or on commercial sale, that are robust against attacks involving carefully chosen distortions. Vendors when pressed claim that their systems will withstand most attacks but cannot reasonably be engineered to survive sophisticated ones. However, in the experience of a number of industries, it is ‘a wrong idea that high technology serves as a barrier to piracy or copyright theft; one should never underestimate the technical capability of copyright thieves’.
Our current opinion is that most applications have a fairly sharp trade-off between robustness and data rate which may prevent any single marking scheme meeting the needs of all applications. However we do not see this as a counsel of despair. The marking problem has so far been over-abstracted; there is not one ‘marking problem’ but a whole constellation of them. Most real applications do not require all of the properties in the above list. For exam-