Padding Oracle attacks

In network protocols it's generally good practice to send an meaningful error message to clients. Examples from SMTP where one might see messages like "Unable to deliver because no such mailbox exists" help the sender realise that they have misaddressed an email, or an HTTP server might give a 404 error "Page not found" or a 401 error "Access Denied".

It turns out in cryptography that this is not such a good idea, and that leaking information about whether a block of cipher text is correctly padded or not can allows us to decrypt the whole of the text. There is a good wikipedia article here


Here is the definition of the PKCS7 padding function that which can be used in CBC mode cipher

The message is padded so it is a whole number of blocks. The value of the pad is the number of bytes that are required to be added. If the message is already a whole number of blocks then we add another block of padding.


The message "First Program is Hello World" is 28 bytes long. This is gives 1 full 16 byte block ( used in AES128 ) and part block which requires padding.

| F | i | r | s | t |   | P | r | o | g | r | a | m |   | i | s |
|   | H | e | l | l | o |   | W | o | r | l | d | 

So we pad it to a 16 byte block size by adding 4 bytes of 0x4 giving

| F | i | r | s | t |   | P | r | o | g | r | a | m |   | i | s |
|   | H | e | l | l | o |   | W | o | r | l | d |0x4|0x4|0x4|0x4|

CBC mode encryption

The message is broken into several blocks ${m_i}$. If the message is not an integer number of blocks we pad as described above.

Encryption is defined

$c_n=E(k,m_n \oplus c_{n-1})$

with $c_{-1}=IV$. $IV$ is an Initialisation Vector, $E(k,x)$ is a symmetric block cipher like AES or DES

Decryption is defined by

$m_n=D(k,c_n) \oplus c_{n-1}$

and we remove any padding on the last block.

The Attack

Note that in decryption we xor in the previous block of cipher text with the message plain text. If we manipulate that, and a padding oracle tells us whether we have a valid pad or not; then we can decrypt the entire cipher text a byte at time.

First byte

We have two consecutive blocks of cipher text $c_{i-1}$, and $c_i$

We guess that the last byte of the plaintext block $m_i$ is $g_i^{15}$

We submit the two blocks $ (c_{i-1} \oplus 000000000000000g_i^{15} \oplus 0000000000000001) || c_i $ to the padding oracle.

The padding oracle decrypts thus

$ D(k,c_i) \oplus (c_{i-1} \oplus 000000000000000g_i^{15} \oplus 0000000000000001) $

And since

$ D(k,c_i) \oplus c_{i-1} = m_i $

we get $ m_i \oplus 000000000000000g_i^{15} \oplus 0000000000000001 $

If $g_i^{15}=m_i^{15}$ where $m_i^j$ is the jth byte of plaintext block $m_i$


$ m_i \oplus 000000000000000g_i^{15} \oplus 0000000000000001 = m_{i}^0m_{i}^1m_{i}^2m_{i}^3m_{i}^4m_{i}^5m_{i}^6m_{i}^7m_{i}^8m_{i}^9m_{i}^{10}m_{i}^{11}m_{i}^{12}m_{i}^{13}m_{i}^{14}1$

Which has a valid pad, and our padding oracle will output true

Second byte

Now we have calculated the first byte we can repeat for the second byte

We submit $ (c_{i-1} \oplus 00000000000000g_i^{14}g_i^{15} \oplus 0000000000000022) || c_i $ to the padding oracle

Since we already have $ g_i^{15}=m^{15}_i $

the padding oracle outputs true if $g_i^{14}=m_i^{14}$

Third byte

Similary for the third byte

We submit $ (c_{i-1} \oplus 0000000000000g_i^{13}g_i^{14}g_i^{15} \oplus 0000000000000333) || c_i $ to the padding oracle

Since we already have $ g_i^{15}=m_i^{15} $ and $g_i^{14}=m_i^{14}$

the padding oracle outputs true if $g_i^{13}=m_i^{13}$


In this way we can decrypt each block a byte at a time, until we have all the plain text. Each byte has only 256 possibilities, so we decrypt a block with at most $2^8$ guesses per byte


The cryptography course sets this as a test in one of the programming questions. The oracle is available as a web service, and gives a 403 error if there is a bad pad, or a 404 error if the pad is correct, but the plaintext is wrong.

Here is my solution, it's quite fun to see the plaintext get discovered. The code could be improved quite a lot. The managing of the padding is hamfisted, and the speed could be improved by guessing a smaller subset of characters.