1A Review of Literature Concerning RC4andAn Improved Cryptosystem Based Upon RC4Michael LeeComputer Science Dept., University of California at Santa Barbarakirbysdl@cs.ucsb.eduAbstract. This paper discusses important features and issues surrounding the RC4 encryption algorithm. I begin by introducing the study of stream ciphers as an important field of cryptography. I then proceed to examine the RC4 algorithm itself as well as a popular implementation of the algorithm known as Ciphersaber. In particular, I review and analyze security concerns of both RC4 and Ciphersaber. I conclude after proposing and briefly discussing a modified Ciphersaber cryptosystem that works around the known security concerns of RC4.1IntroductionCryptographic algorithms can be broken into several major categories. First, they can be classified as public-key or private-key cryptosystems. Private-key systems are further separated into stream ciphers and block ciphers. Block ciphers generally encrypt a large block of plaintext at a time, whereas stream ciphers encrypt one datum at a time. Stream ciphers are popular for several reasons. The first stream cipher, known as a one-time pad, [Ver26] is the only cryptosystem that has been proven to be unbreakable [Sha49].However, the one-time pad had serious key management and distribution problems, so it was abandoned. Since then, stream ciphers have attempted to duplicate the one-time pad’s security while at the same time avoiding its key management problems.Stream ciphers are also generally much faster than block ciphers because of the way they operate. The keystream for a stream cipher is oftentimes generated independently of the plaintext. In fact, the keystream can even be generated prior to the encryption step. During encryption, each element of the keystream is combined with each element of the plaintext to produce the ciphertext. The combining function used is usually the exclusive or operator (XOR). In the past, stream ciphers usually implemented a combination of linear feedback shift registers (LFSRs). However,
2attacks have been developed that will break most systems of LFSR-based stream ciphers. 2The RC4 Stream CipherRC4, Which stands for “Ron’s Code 4,” [RivFAQ] is an example of a stream cipher that does not use LFSRs. Ronald Rivest designed it for RSA Data Security, Inc. (RSADSI) in 1987. It was, and is still considered by RSADSI to be confidential and proprietary. However, in 1994 an anonymous contributor posted an algorithm to the Cypherpunks mailing list that has since proven to be compatible with RC4 [Cyp94]. Henceforth, this alleged RC4 algorithm will be referred to as RC4, because it is overwhelmingly likely that it is the same as the official RC4 algorithm. The legal status of RC4 isthe source of some debate [Sch96] [Rei97]. There are two ways to protect an invention: either patent it or keep it a trade secret. Keeping an invention as a trade secret is dangerous because it can be leaked, reverse-engineered, or otherwise discovered by the public. Since RC4 has apparently been leaked, it can no longer be considered a trade secret. However, RSADSI would most likely attempt to sue anyone who uses RC4 in a commercial application without a license, and the cost of litigation would probably outweigh the cost of obtaining a license.RC4 is a very popular stream cipher. First of all, it is extremely fast. Optimized versions of RC4 can encrypt over 20MB of data every second on a 150MHz Pentium [SW97]. In addition it is simple enough to code from memory and it is still unbroken after over a decade. It is used in Lotus Notes and Oracle Secure SQL [Sch96], the Secure Sockets Layer and Secure Shell protocols, and dozens of other applications. It is small both in code size because of its simple algorithm and in memory footprint because of its 256-byte state table. Lastly, it is also popular because it is unusual. RC4 is one of the few strong stream ciphers that do not use LFSRs.3The RC4 AlgorithmLike most stream ciphers, RC4 generates a keystream from its internal state. The cipher operates in OFB mode, which means that the next state is derived from the current state and the current keystream byte [RSAFAQ]. RC4 uses an array of 256 bytes containing a permutation of the integers 0 through 255 as its internal state. Its operation has two phases: the setup phase and the encryption/decryption phase. All math is done modulo 256, which is the size of the state table. During the setup phase, the state table is first initialized such thatstate[i]=i i=[0, 255]. This is done in lines 107-108 (please refer to the attached source code for line numbers). Next, the key is mixed into the state table in lines 114-117. For each of 256 iterations, i is incremented (line 114). Then, j gets incremented by the ithelement of the state and the ithelement of the key in line 115. Notice the modulus operator causes the key to effectively wrap around on itself, so that a key of length less than 256 may be used. Finally, the elements indexed by i andj are swapped in line 116.