[Course home page] [Lecture notes]
YALE UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE
CPSC 467a: Cryptography and Computer Security
Professor M. J. Fischer | September 22, 2005 |
Lecture Notes 7
1 Message authentication codes (MACs)
So far in the course, we have been discussing only a single
cryptographic application, namely, secret message transmission from
Alice to Bob over a a publicly-readable channel. Our goal has been to
maintain privacy in the face of an eavesdropper Eve.
Once we assume that Eve has the power to modify messages and generate
her own messages as well as eavesdrop, life becomes more difficult.
We usually call that kind of a malicious active adversary "Mallory"
(to distinguish him from Eve, who only eavesdrops).
1.1 Attacks by malicious active adversaries
Encryption alone no longer solves Alice and Bob's problem. Alice
sends c=Ek(m), but Bob may receive a corrupted or forged c′ ≠ c. How then does Bob know that the message he receives really was
sent by Alice?
The naive answer is that Bob computes m′ = Dk(c′), and if m′
"looks like" a valid message, then Bob accepts it as having come
from Alice. The reasoning here is that Mallory, not knowing k, could
not possibly have produced a valid-looking message.
For any particular cipher such as DES, that assumption may or may not
be valid, but here are two things to watch out for:
- There are three successively easier possible attacks in which Mallory
might produce fraudulent messages:
-
He might produce c′ = Ek(m′) for a message
m′ of his choosing.
-
He might produce a message c′ for which the
corresponding plaintext m′ is a valid message, even though
he could not choose m′ in advance, nor perhaps he does not
even know what m′ is.
-
He might be able to alter a legitimate
message c from Alice to produce a new message c′ that
corresponds to an altered form m′ of the true message m.
For example, if m represents an amount of money, it is
conceivable that Mallory could find the encryption of m+1
given the encryption of m, without knowing either m or
m+1.
Attack (1a) is similar to, but not the same as, the notion
of breaking the cryptosystem that we have been studying. We have been
asking that it be hard for Eve to compute m = Dk(c) knowing c but
not k. To carry out attack (1a) requires that Mallory
compute Ek(m) knowing m but not k. It's conceivable that he
could do the latter without being able to do the former.
One form of attack (1b) clearly is possible, the
so-called replay attack. This is when Mallory substitutes a
legitimate old encrypted message c′ for the current message c. It
can be thwarted by adding timestamps and/or sequence numbers to the
messages, so that Bob can recognize when old messages are being
received. Of course, this only works if Alice and Bob anticipate the
attack and incorporate appropriate countermeasures into the protocol
they are using.
However, even if replay attacks are ruled out, a cryptosystem that is
secure against attack (1a) might still permit attack
(1b). There are all sorts of ways that Mallory can generate
values c′. What gives us confidence that Bob won't accept one of
them as being valid?
Attack (1c) might be possible even in a cryptosystem
that is free from attacks (1a) and (1b). For
example, if c1 and c2 are encryptions of valid messages,
perhaps so is c1 ⊕c2. Whether or not it is depends
entirely on particular properties of Ek. It does not follow in
general from the difficulty of decrypting a given ciphertext. We
will see some cryptosystems later which do have the property of
being vulnerable to attack (1c). In some contexts, this
can actually be a useful property, as we will see.
-
Cryptosystems are not always used to send natural language or other
highly-redundant messages. For example, suppose Alice wants to send
Bob her password to a web site. Knowing full well the dangers of
sending passwords in the clear over the internet, she chooses to
encrypt it instead. Since passwords are supposed to look like random
strings of characters, Bob will likely accept anything he gets from
Alice. He could be quite embarrassed (or worse) claiming he knew the
correct password when in fact the password he thought was from Alice
was actually a fraudulent one derived from a random ciphertext c′
produced by Mallory.
What Alice and Bob need to solve their problem is called a
Message Authentication Code or MAC. A MAC is
generated by a function function Ck(m) that can be computed by
anyone knowing a secret key k. However, it should be hard for an
attacker to find any pair (m, ξ) such that ξ = Ck(m) without
knowing k. In fact, this should remain hard even if the attacker
knows a set of valid MAC pairs { (m1, ξ1), …, (mt, ξt)}, so long as m itself is not the message in one of the known
pairs.
Using a MAC, Alice can send both c = Ek(m) and also ξ = Ck(m).
Bob receives c′ and ξ′, possibly different from what Alice sent.
Bob computes m′ = Dk(c′) and then checks that ξ′ = Ck(m′). If
the check succeeds, then Bob accepts m′ as a valid message from
Alice. We say that Mallory successfully cheats if Bob accepts m′ as
valid but either m′ ≠ m or Alice sent no message at all.
1.2 Computing MACs
A block cipher such as DES can be used to compute a MAC by making use
of one of the ciphertext chaining modes, CBC or CFB. (See
Lecture notes 4.) In these modes, the last ciphertext block ct
depends on all t message blocks m1, …, mt. Therefore, we
define Ck(m) = ct. The result of this process is reputed to be a
good MAC generation function. Note that the MAC is only a single
block long, which in general is much shorter than the message. A MAC
acts like a checksum for preserving data integrity, but it has the
advantage that an adversary cannot compute a valid MAC for an altered
message.
2 Asymmetric cryptosystems
A major advance in cryptography is the modern development of
asymmetric (also called 2-key or public key)
cryptosystems. The idea is simple. Instead of having a single key
k that is used by both Alice and Bob, an asymmetric cryptosystem has
a pair of related keys, an encryption key e and a
decryption key d. Alice encrypts a message m by computing
c = Ee(m). Bob decrypts by computing m = Dd(c).1 As always, the decryption
function inverts the encryption function, so the following is always
satisfied:
What makes asymmetric systems useful is the additional requirement
that it be difficult to find d from e, and more generally, that it
be difficult to find m given both e and c = Ee(m). Thus, the system
remains secure even if the encryption key e is made public!
There are several reasons to make e public. One is that it is then
possible for anybody to send a private message to Bob. Sandra need
only obtain Bob's public key e and send Bob Ee(m). Bob recovers
m by applying Dd to the received ciphertext. This greatly
simplifies the key management problem, for there is no longer the need
for a secure channel between Alice and Bob for the initial key
distribution (which I have carefully avoided talking about so far).
Of course, an active adversary Mallory can still do some nasty things.
For example, he could send his own encryption key to Sandra when she
attempts to obtain Bob's key. Not knowing she has been duped, Sandra
would then encrypt her private data in a way that Bob could not read
but Mallory could! If Mallory wanted to carry out this charade for a
longer time without being discovered, he could intercept each message
from Sandra to Mallory, decrypt the message using his own decryption
key, then re-encrypt it using Bob's public encryption key and send it
on its way to Bob. Bob, receiving a validly encrypted message, will
be none the wiser to Mallory's shenanigans. This is an example of a
man-in-the-middle attack.
The security requirements for an asymmetric cryptosystem are more
stringent than for a symmetric cryptosystem. For example, the system
must be secure against large-scale chosen plaintext attacks, for Eve
can generate as many plaintext-ciphertext pairs as she wishes using
the public encryption function Ee().
The public encryption function also gives Eve a way to check the
validity of a potential decryption. Namely, if Eve guesses that
Dd(c) = m0 for some candidate message m0, she can check her
guess by testing if c = Ee(m0). Thus, whether or not there is
redundancy in the set of meaningful messages is of no consequence to
her since she now has an independent way of determining a successful
decryption.
Designing a good asymmetric cryptosystem is much harder than designing
a good symmetric cryptosystem. All of the known asymmetric systems
are orders of magnitude slower than corresponding symmetric systems.
Therefore, they are often used in conjunction with a symmetric cipher
to form a hybrid system. Here's how this works. Suppose
(E2, D2) is a 2-key cryptosystem and (E1, D1) is a 1-key
cryptosystem. To send a secret message m to Bob, Alice first
generates a random session key k for use with the 1-key
system. which she then uses to encrypt m. She then encrypts the
session key using Bob's public key for the 2-key system and sends Bob
both ciphertexts. In formulas, she sends Bob c1 = E1k(m) and
c2 = E2e(k). Bob decrypts c2 using D2d() to obtain k
and then decrypts c1 using D1k() to obtain m. This is much
more efficient than simply sending D2e(m) in the common case that
m is much longer than k.
Footnotes:
1We
often get sloppy with notation when discussing asymmetric
cryptosystems. Let k = (e,d) be a key pair. We sometimes write
ke and kd for the first and second keys, respectively, so we
might use the rather cumbersome notation Eke(m) and
Dkd(c). But then we might simplify this by dropping the
second-level subscripts to get the same notation we use for symmetric
cryptosystems, namely Ek(m) and Dk(c). Nevertheless, it should
still be understood that the "k" in Ek refers to the first
element of the key pair, whereas the "k" in Dk refers to the
second. In practice, it isn't generally as confusing as all this. but
the potential for misunderstanding is there.
File translated from
TEX
by
TTH,
version 3.66.
On 26 Sep 2005, 18:22.