Some Notes on CBC-Mode, IVs and MACs

I recently read this tweet which gave an example for why you should use good IVs in your crypto. The tweet was:

Why you should always use good IVs in your #crypto

This is the about the example that was given [1]

$ echo 'Give Eve $500' > plaintext
$ cat plaintext
Give Eve $500
$ xxd ciphertext
0000000: 53 61 6c 74 65 64 5f 5f 1a 3c cd 23 cc 4c 9f c9  Salted__.<.#.L..
0000010: a6 e2 57 10 7b 8d c3 23 ea 51 01 05 c9 8b 13 9f  ..W.{..#.Q......
$ openssl enc -d -aes-128-cbc -iv 00000000000000000000000000000000 -pass pass:hunter2 -in ciphertext
Give Eve $500
$ openssl enc -d -aes-128-cbc -iv 00000000000000000000030000000000 -pass pass:hunter2 -in ciphertext
Give Eve $600
$ openssl enc -d -aes-128-cbc -iv 00000000000719070000030000000000 -pass pass:hunter2 -in ciphertext
Give Bob $600

At first I liked the example, because of it's simplicity, but then I thought: Wait this isn't right. What does it have to do with randomness of the IVs? If we can manipulate the IV, we can flip any bit in the first block, regardless of the used IV. So let's do that with a random IV.:

$ cat plaintext
Give Eve $500
$ dd if=/dev/random bs=32 count=1 | xxd -p
0+1 records in
0+1 records out
16 bytes (16 B) copied, 0.000186855 s, 85.6 kB/s
$ openssl enc -p -aes-128-cbc -iv 01998163763025fd4577eb6f8a55eda5 -pass pass:hunter2 -in plaintext -out ciphertext2
iv =01998163763025FD4577EB6F8A55EDA5
$ openssl enc -d -aes-128-cbc -iv 01998163763025fd4577eb6f8a55eda5 -pass pass:hunter2 -in ciphertext2
Give Eve $500.
$ openssl enc -d -aes-128-cbc -iv 01998163763025fd4577e86f8a55eda5 -pass pass:hunter2 -in ciphertext2
Give Eve $600.

As we can see it's not really more difficult to perform such an attack with a random IV. (just some XOR calculation) But why does this work again? Well with CBC mode you always "mix" the last ciphertext block into your current block, to chain the blocks together. So every change in the plaintext changes the following ciphertext. But what to mix into the first block? Yes the IV. So we can write encryption as

\[ C_i = E_k(P_i \oplus C_{i-1}) \text{ with } C_0 = IV \]

And decryption is just the same process in reverse:

\[P_i = D_k(C_i) \oplus C_{i-1} \text{ with } C_0 = IV \]

Check out wikipedia [7] for a good illustration. But what we can see immediately is that for \(C_1\) the IV will be XORed against the resulting plaintext. Since the IV musst be prepended to the message in plain in order for the decryption to work, it can also be changed and this is why the attack shown above works.

Authenticated Encryption

So what this example actually demonstrates pretty good is that you need integrity/authenticity protection even though you encrypted the data. It also shows that you should include your IV in your integrity protection, when using CBC.

So that brings us to the discussion of how to combine encryption with authenticity guarantees. Apparently the recommended way is to "encrypt-then-mac", because it has certain security properties the other ways lack. E.g. there are ciphers that are malleable, which means an attacker can produce another valid ciphertext, given one ciphertext [6]. This doesn't concern us in the "encrypt-then-mac" case, in other cases it might. Although "mac-then-encrypt" and "encrypt-and-mac" are not per se insecure, they require that decryption is performed on unauthenticated data, which is intuitively bad (as we have seen in the CBC example above). Performing decryption on unauthenticated data has been called "the cryptographic doom principle" [3].

So what we would actually need to do, is to combine our primitives to encrypt and then authenticate the encrypted data and it's parameters.

\[ C = E_{key, IV}(P) \text{ and } m = HMAC_{key}(IV || C) || IV || C \]

There have been many advances in the area of combining encryption and authentication into one "do-it-all" cipher: GCM, EAX, CCM, CWX, to name a few. Galois Counter Mode (GCM) is apparently considered the best choice by the NIST and the NSA [10] and also seems to become the best choice for TLS (which is actually mac-then-encrypt). Apparently it's really hard to create a reasonably fast AES-GCM implementation in software that is not vulnerable against timing side channel attacks [4], although if you have hardware support, e.g. CPUs with AES-NI, it's a really fast and secure cipher.

Although these cipher modes have desirable properties, some people argue that they are too complicated and shouldn't be recommended to developers [2]. A simple combination of AES in CBC or CTR mode and a HMAC-SHA256 in the encrypt-then-mac fashion might be good enough and most of all simple enough to minimize the risk of implementation mistakes.

Also the combination "ChaCha20-Poly1305" is considered to offer very good security and very good speed also with software implementations on low-power devices, such as phones. Google is starting to support this ciphersuite for TLS connections [4]. Patches for NSS and OpenSSL are on the way [5]. As I understand ChaCha20 is a stream cipher and the successor to Salsa20 designed by Daniel J. Bernstein and Poly1305 a MAC, which utilizes any cipher (first proposed version was using AES, in this case it's also using ChaCha20) [9] .

All in all cryptography is still a pretty complex topics and all efforts to simplify it for people without security background are probably doomed. Although some libraries like google's keyczar and the nacl library seem to be able to do a pretty good job abstracting the nasty details, while some others don't [11].

For more information I really advise to read the blog posts [2], [4] and [3]. Furthermore any search for "authenticated encryption" in your favorite search engine should result in enough reading material on the topic.

[2](1, 2)
[3](1, 2)
[4](1, 2, 3)