Principles of Cryptography

Although cryptography has a long history dating back to Julius Caesar (we will look at the so-called Caesar cipher shortly), modern cryptographic techniques, including many of those used in today's Internet, are based on advances made in past twenty years. The books [Kahn 1967Singh 1999] provide a fascinating look at this long history. A detailed (but entertaining and readable) technical discussion of cryptography, particularly from a network standpoint, is [Kaufman 1995]. [Diffie 1998] provides a compelling and up-to-date examination of the political and social (e.g., privacy) issues that are now inextricably intertwined with cryptography. A complete discussion of cryptography itself requires a complete book [Kaufman 1995Schneier 1996] and so below we only touch on the essential aspects of cryptography, particularly as they are practiced in today's Internet. Two excellent on-line sites are [Kessler 99] and the RSA Labs FAQ page [RSA 1999c].

Cryptographic techniques allow a sender to disguise data so that an intruder can gain no information from the intercepted data. The receiver, of course must be able to recover the original data from the disguised data. Figure 7.2-1 illustrates some of the important terminology:


Figure 7.2-1: Cryptographic components

Suppose now that Alice wants to send a message to Bob. Alice's message in its original form (e.g., "Bob, I love you. Alice") is known as plaintext, or cleartext. Alice encrypts her plaintext message using an encryption algorithm so that the encrypted message, known as ciphertext, looks unintelligible to any intruder. Interestingly, in many modern cryptographic systems, including those used in the Internet, the encryption technique itself is known - published, standardized, and available to everyone (e.g., [RFC 1321RFC 2437,RFC 2420), even a potential intruder! Clearly, if everyone knows the method for encoding data, then there must be some bit of secret information that prevents an intruder from decrypting the transmitted data. This is where keys come in.

In Figure 7.2-1 Alice provides a keyKA, – a string of numbers or characters, as input to the encryption algorithm. The encryption algorithm takes the key and the plaintext as input and produces ciphertext as output. Similarly, Bob will provide a key KB, to the decryption algorithm, that takes the ciphertext and Bob's key as input and produces the original plaintext as output. In so-called symmetric key systems, Alice and Bob's keys are identical and are secret. In public key systems, the key that Alice uses is known to all (!), while Bob's key is secret. In the following two subsections, we consider symmetric key and public key systems in more detail.

Symmetric Key Cryptography

All cryptographic algorithms involve substituting one thing for another, e.g., taking a piece of plaintext and computing the appropriate ciphertext that forms the encrypted message. Before studying a modern key-based cryptographic system, let us first "get our feet wet" by studying a very old simple symmetric key algorithm attributed to Julius Caesar, known as the Caesar cipher (a "cipher" is a method for encrypting data).

For English text, the Caesar cipher would work by taking each letter in the plaintext message and substituting the letter that is k letters later (allowing wraparound, i.e., having the letter "a" follow the letter "z") in the alphabet. For example if k=4, then the letter "a" in plaintext becomes "d" in ciphertext; "b" in plaintext becomes "e" in ciphertext, and so on. Here, the value of k serves as the key. As an example, the plaintext message "bob, I love you. alice." becomes "yly, f ilsb vlr. xifzb." in ciphertext. While the ciphertext does indeed look like gibberish, it wouldn't take long to break the code if you knew that the Caesar cipher was being used, as there are only 25 possible key values.

An improvement to the Caesar cipher is the so-called monoalphabetic cipher that also substitutes one letter in the alphabet with another letter in the alphabet. However, rather than substituting according to a regular pattern (e.g., substitution with an offset of k for all letters), any letter can be substituted for any other letter, as long as each letter has a unique substitute letter and vice versa. Many newspaers in the US carry cryptographic puzzles based on this cipher. The substitution rule in Figure 7.2-2 shows one possible rule for encoding plaintext.

Figure 7.2-2: a monoalphabetic cipher

plaintext letter:

a

b

c

d

e

f

g

h

i

f

k

l

m

n

o

p

q

r

s

t

u

v

w

x

y

z

ciphertext letter:

m

n

b

v

c

x

z

a

s

d

f

g

h

j

k

l

p

o

i

u

y

t

r

e

w

q

The plaintext message "bob, I love you. alice." becomes "nkn, s gktc wky. mgsbc" Thus, as in the case of the Caesar cipher, this looks like gibberish. A monoalphabetic cipher would also appear to be better than the Caesar cipher in that there are 26! (on the order of 1026) possible pairings of letters rather than 25 possible pairings. A brute force approach of trying all 1026 possible pairings would require far too much work to be a feasible way of breaking the encryption algorithm and decoding the message. However, by statistical analysis of the plaintext language, e.g., knowing that the letters "e" and "t" are the most frequently occurring letters in typical text (accounting for 13% and 9% of letter occurrences), and knowing that particular two- and three-letter occurrences of letters appear quite often together (e.g., "in", "it", "the" "ion", "ing", etc.) make it relatively easy to break this code. If the intruder has some knowledge about the possible contents of the message, then it is even easier to break the code. For example, if Trudy the intruder is Bob's wife and suspects Bob of having an affair with Alice, then she might suspect that the names "bob" and "alice" appear in the text." If Trudy knew for certain that those two names appeared in the ciphertext and had a copy of the example ciphertext message above, then she could immediately determine 7 of the 26 letter pairings, since "alice" is the only five-letter word in the message, and "bob" is the only three-letter word that has an identical first and last letter. Thus, Trudy requires 109 fewer possibilities to be checked a by brute force method. Indeed, if Trudy suspected Bob of having an affair, she might well expect to find some other choice words in the message as well.

When considering how easy it might be for Trudy to break Bob and Alice's encryption scheme, one can distinguish three different scenarios, depending on what information the intruder has:

Five hundred years ago, techniques improving on monoalphabetic encryption, known as polyalphabetic encryption were invented. These techniques, incorrectly attributed to Blaise de Vigenere [Kahn 1967] have come to be known as Vigenere ciphers. The idea behind Vigenere ciphers is to use multiple monoalphabetic ciphers, with a specific monoalphabetic cipher to encode a letter in a specific position in the plaintext message. Thus, the same letter, appearing in different positions in the plaintext message might be encoded differently. The Vigenere cipher shown in Figure 7.2-3 has two different Caesar ciphers (with k=6 and k=20), shown as rows in Figure 7-2-3. One might choose to use these two Caesar ciphers, C1and C2, in the repeating pattern C1, C2, C2, C1, C2. That is, the first letter of plaintext is to encoded using C1, the second and third using C2, the fourth using C1, and the fifth using C2. The pattern then repeats, with the sixth letter being encoded using C1, the seventh with C2, and so on. The plaintext message "bob, I love you. alice." is thus encrypted "ghu, n etox dhz." Note that the first "b" in the plaintext message is encrypted using C1, while the second "b" is encrypted using C2. In this example, the encryption and decryption "key" is the knowledge of the two Caesar keys (k=4, k=20) and the pattern C1, C2, C1, C2, C2.

Figure 7.2-3: A Vigenere cipher using two Caesar ciphers

plaintext letter:

a

b

c

d

e

f

g

h

i

f

k

l

m

n

o

p

q

r

s

t

u

v

w

x

y

z

C1(k=6):

f

g

h

i

j

k

l

m

n

o

p

q

r

s

t

u

v

w

x

y

z

a

b

c

d

e

C2(k=20):

t

u

v

w

x

y

z

a

b

c

d

e

f

g

h

i

j

k

l

m

n

o

p

q

r

s

Data Encryption Standard (DES)

Let us now fast forward to modern time and examine the Data Encryption Standard (DES) [NIST 1993] , a symmetric key encryption standard published in 1977 and updated most recently in 1993 by the US National Bureau of Standards for commercial and non-classified US government use. DES encodes plaintext in 64 bit chunks using a 64-bit key. Actually, 8 of these 64 bits are odd parity bits (one bit in each of the 8 bytes is an odd partity bit for that byte), so the DES key is effectively 56 bits long. The National Institute of Standards (the successor to the National Bureau of Standards) states the goal of DES as follows: " The goal is to completely scramble the data and key so that every bit of the ciphertext depends on every bit of the data and every bit of the key .... with a good algorithm, there should be no correlation between the ciphertext and either the original data or key." [NIST 1999].

The basic operation of DES is illustrated in Figure 7.2-4. In our discussion we will overview DES operation, leaving the nitty-gritty bit-level details (there are many!) to those wishing to consult [Kaufman 1995Schneier 1995] (with [Schneier 1995] including a C implementation as well). The DES consists of two permutation steps (the first and last steps of the algorithm) in which all 64 bits are permuted, and 16 identical "rounds" of operation in between. The operation of each round is identical, taking the output of the previous round as input. During each round, the rightmost 32 bits of the input are moved to the left 32 bits of the output. The entire 64-bit input to the ith round and the 48 bit key for the ith round (derived from the larger DES 56-bit ) are taken as input to a function that involves expansion of four-bit input chunks into six-bit chunks, exclusive OR-ing with the expanded six bit chunks of the 48-bit key Ki, a substitution operation and further exclusive OR-ing with the leftmost 32 bits of the input; see [Kaufman 1995Schneier 1995] for details. The resulting 32-bit output of the function is then used as the rightmost 32 bits of the rounds 64-bit output, as shown in Figure 7.2-4. Decryption works by reversing the algorithm's operations.


Figure 7.2-4: Basic operation of the DES

How well does DES work? How secure is it? No one can tell for sure, although recent speculation is that one could build a special purpose machine that exhaustively searched through the 56-bit key space for under a million dollars [Kaufman 1995]. In 1997, a network security company, RSA Data Security Inc, launched a DES Challenge contest to "crack" (decode) a short phrase it had encrypted using 56-bit DES. The unencoded phrase ( “Strong cryptography makes the world a safer place.”) was determined only 140 days later by a team that used volunteers throughout the Internet to systematically explore the key space. The team claimed the $10,000 prize after testing only a quarter of the key space - about 18 quadrillion keys [RSA 1997]. The most recent 1999 DES Challenge III was won in a record time of a little over 22 hours, with a network of volunteers and a special purpose computer that was built for less that $250,000 (nick-named "DES Cracker") and is documented on-line [EFF 1999].

If 56-bit DES is considered too insecure, one can simply run the 56-bit algorithm multiple times, taking the 64-bit output from one iteration of DES as the input to the next DES iteration, using a different encryption key each time. For example, so-called triple-DES (3DES), is a proposed US government standard [NIST 1999b] and has been proposed as the encryption standard for the Point-to-Point protocol [RFC 2420], PPP, for the data link layer (see section 5.7). A detailed discussion of key lengths and the estimated time and budget needed to crack DES can be found in [Blaze 1996].

We should also note that our description above has only considered the encryption of a 64-bit quantity. When longer messages are encrypted, which is typically the case, DES is often used with a technique known as cipher-block chaining, in which the encrypted version of the jth 64-bit quantity of data is XOR'ed with the (j+1)st unit of data before the (j+1)st unit of data is encrypted.

Public Key Encryption

For more than 2000 years (since the time of the Caesar cipher and up to the 1970's), encrypted communication required that the two communicating parties share a common secret – the symmetric key used for encryption and decryption. One difficulty with this approach is that the two parties must somehow agree on the shared key; but to do so requires (presumably secure) communication! Perhaps the parties could first meet and agree on the key in person (e.g., two of Caesar's centurions might meet at the Roman baths) and thereafter communicate with encryption. In a networked world, however, communicating parties may never meet and may never converse except over the network. Is it possible for two parties to communicate with encryption without having a shared secret key that is known in advance? In 1976, Diffie and Hellman [Diffie 1976] demonstrated an algorithm (known now as Diffie-Hellman Key Exchange) to do just that - a radically different and marvelously elegant approach towards secure communication that has led to the development of today's public key cryptography systems. We will see shortly that public key cryptography systems also have several wonderful properties that make them useful not only for encryption, but for authentication and digital signatures as well. The ideas begun with [Diffie 1976] have evolved, with a significant milestone being [RSA 1978], into the public key systems in use today.


Figure 7.2-5: Public key cryptography

The use of public key cryptography is quite simple. Suppose Alice wants to communicate with Bob. As shown in Figure 7.2-5, rather than Bob and Alice sharing a single secret key (as in the case of symmetric key systems), Bob (the recipient of Alice's messages) instead has two keys – a public key that is available to everyone in the world (including Trudy the intruder!) and a private key that is known only to Bob. In order to communicate with Bob, Alice first fetches Bob's public key. Alice then encrypts her message to Bob using Bob's public key and a known (e.g., standardized) encryption algorithm. Bob receives Alice's encrypted message and uses his private key and a known (e.g., standardized) decryption algorithm to decrypt Alice's message. In this manner, Alice can send a secret message to Bob without either of them having to have to distribute any secret keys!

Using the notation of Figure 7.2-5, for any message mdB(eB(m)) = m, i.e., applying Bob's public key then Bob's private key to the message m gives back m. We will see shortly that we can interchange the public key and private key encryption and get the same result, that is, eB(dB(m)) = dB(eB(m)) = m.

The use of public key cryptography is thus conceptually simple. But two immediate worries may spring to mind. A first concern is that although an intruder intercepting Alice's encrypted message will only see gibberish, the intruder knows both the key (Bob's public key, which is available for all the world to see) and the algorithm that Alice used for encryption. Trudy can thus mount a chosen plaintext attack, using the known standardized encryption algorithm and Bob's publicly available encryption key to encode any message she chooses! Trudy might well try, for example, to encode messages, or parts of messages, that she suspects that Alice might send. Clearly, if public key cryptography is to work, key selection and encryption/decryption must be done in such a way that it is impossible (or at least so hard to be impossible for all practical purposes) for an intruder to either determine Bob's private key or somehow otherwise decrypt or guess Alice's message to Bob. A second concern is that since Bob's encryption key is public, anyone can send an encrypted message to Bob, including Alice or someone claiming to be Alice. In the case of a single shared secret key, the fact that the sender knows the secret key implicitly identifies the sender to the receiver. In the case of public key cryptography, however, this is no longer the case since anyone can send an encrypted message to Bob using Bob's publicly available key. Certificates, which we will study in section 7.5, are needed to bind an entity (such as Bob) to a specific public key.

While there may be many algorithms and keys that have this property, the RSA algorithm (named after its founders, Ron Rivest, Adi Shamir, and Leonard Adleman) has become almost synonymous with public key cryptography. Let's first see how RSA works and then examine why it works. Suppose that Bob wants to receive encrypted messages, as shown in Figure 7.2-5. The are two inter-related components of RSA:

In order to choose the public and private keys, Bob must do the following:

The encryption by Alice, and the decryption by Bob is done as follows:

c = me mod n

m = cd mod n

which requires the use of his secret key, (n,d).

As a simple example of RSA, suppose Bob chooses p=5 and q=7 (admittedly, these values are far too small to be secure). Then n=35 and z=24. Bob chooses e=5, since 5 and 24 have no common factors. Finally, Bob chooses d=29, since 5*29 - 1 (i.e., ed −1 ) is exactly divisible by 24. Bob makes the two values, n=35 and e=5, public and keeps the value d=29secret. Observing these two public values, suppose Alice now wants to send the letters 'l' 'o' 'v' and 'e' to Bob. Interpreting each letter as a number between 1 and 26 (with 'a' being 1, and 'z' being 26), Alice and Bob perform the encryption and decryption shown in Figures 7.2-6 and 7.2-7, respectively:

Figure 7.2-6: Alice's RSA encryption, e=5, n = 35

plaintext
letter

m: numeric
representation

me

ciphertext
c = me mod n

l

12

248832

17

o

15

759375

15

v

22

5153632

22

e

5

3125

10

Figure 7.2-7: Bob's RSA decryption, d=29, n=35

ciphertext
c

cd

m = cd mod n

plaintext
letter

17

481968572106750915091411825223072000

12

l

15

12783403948858939111232757568359400

15

o

22

8.5164331908653770195619449972111e+38

22

v

10

100000000000000000000000000000

5

e

Given that "toy" example in Figures 7-7 and 7-8 has already produced some extremely large numbers, and given that we know that we saw earlier that p and q should each be several hundred bits long, several practical issues regarding RSA come to mind. How does one choose large prime numbers? How does one then choose e and d? How does one perform exponentiation with large numbers? A discussion of these important issues is beyond the scope of this book; see [Kaufman 1995] and the references therein for details.

We do note here that the exponentiation required by RSA is a rather time consuming process. RSA Data Security [RSA 1999b] says its software toolkit can encrypt/decrypt at a throughput of 21.6 Kbits per second with a 512-bit value for n and 7.4 Kbits per second with a 1024-bit value. DES is at least one hundred times fast in software and between 1000 and 10000 times faster in hardware. As a result, RSA is often used in practice in combination with DES. For example, if Alice wants to send Bob a large amount of encrypted data at high speed, she could do the following. First Alice chooses a DES key that will be used to encode the data itself; this key is sometimes referred to as a session key, KS. Alice must inform Bob of the session key, since this is the shared secret key they will use for DES. Alice thus encrypts the session key value using Bob's public RSA key, i.e., computes c = (KS)e mod n . Bob receives the RSA-encrypted session key, c, and decrypts to obtain the session key, KS.. Bob now knows the session key that Alice will use for her DES-encrypted data transfer.

Why does RSA work?

The RSA encryption/decryption above appears rather magical. Why should it be that by applying the encryption algorithm and then the decryption algorithm, one recovers the original message? In order to understand why RSA works, we'll need to perform arithmetic operations using so-called modulo-n arithmetic. In modular arithmetic, one performs the usual operations of addition, multiplication and exponentiation. However, the result of each operation is replaced by the integer remainder that is left when the result is divided by n. We will take n = pq, where p and q are the large prime numbers used in the RSA algorithm.

Recall that under RSA encryption, a message (represented by an integer), m, is first exponentiated to the power e using modulo-n arithmetic to encrypt. Decryption is performed by raising this value to the power d, again using modulo n arithmetic. The result of an encryption step, followed by a decryption step is thus (me)d . Let's now see what we can say about this quantity. We have:

(me)d mod n = med mod n

Although we're trying to remove some of the "magic" about why RSA works, we'll need to use a rather magical result from number theory here. Specifically, we'll need the result that says if p and q are prime, and n = pq, then xy mod n is the same as x(y mod (p−1)(q−1)) mod n [Kaufman 1995]. Applying this result, we have

(me)d mod n = m (ed mod (p−1)(q−1)) mod n

But remember that we chose e and d such that ed −1 is exactly divisible (i.e., with no remainder) by (p−1)(q−1), or equivalently that ed is divisible by (p−1)(q−1) with a reminder of 1, and thus ed mod (p−1)(q−1) = 1. This gives us

(me)d mod n = m 1mod n = m

i.e., that

(me)d mod n = m.

This is the result we were hoping for! By first exponentiating to the power of e (i.e., encrypting) and then exponentiating to the power of d (i.e., decrypting), we obtain the original value, m. Even more remarkable is the fact that if we first exponentiate to the power of d and then exponentiate to the power of e, i.e., we reverse the order of encryption and decryption, performing the decryption operation first and then applying the encryption operation, we also obtain the original value, m! (The proof for this result follows the exact same reasoning as above). We will see shortly that this wonderful property of the RSA algorithm,

(me)d mod n = m = (md)e mod n

will be of great use.

The security of RSA relies on the fact that there are no known algorithms for quickly factoring a number, in this case the public value n, into the primes p and q. If one knew p and q, then given the public value e, one could then easily compute the secret key, d. On the other hand, it is not know whether or not there exist fast algorithms for factoring a number, and in this sense the security of RSA is not "guaranteed."