User-Server Interaction: Authentication and Cookies
We mentioned above that an HTTP server is stateless. This simplifies server design, and has permitted engineers to develop very high-performing Web servers. However, it is often desirable for a Web site to identify users, either because the server wishes to restrict user access or because it wants to serve content as a function of the user identity. HTTP provides two mechanisms to help a server identify a user: authentication and cookies.
Many sites require users to provide a username and a password in order to access the documents housed on the server. This requirement is referred to as authentication. HTTP provides special status codes and headers to help sites perform authentication. Let us walk through an example to get a feel for how these special status codes and headers work. Suppose a client requests an object from a server, and the server requires user authorization.
After obtaining the first object, the client continues to send the username and password in subsequent requests for objects on the server. (This typically continues until the user closes the browser. While the browser remains open, the username and password are cached, so the user is not prompted again for each object requested.) In this manner, the site can identify the user for every request.
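The exchange can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation, and it assumes the Basic scheme: the server answers a request that lacks credentials with status code 401 and a WWW-Authenticate: header, and the client repeats the request with an Authorization: header carrying the Base64 encoding of username:password. The function names, the realm string, and the in-memory user table are hypothetical, chosen only for this example.

```python
import base64

# Hypothetical user table; a real server would not store plaintext passwords.
USERS = {"alice": "secret"}


def make_authorization_header(username, password):
    """Client side: build the value of the Authorization: header (Basic scheme)."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def handle_request(headers):
    """Server side: return (status_code, response_headers) for one request."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme == "Basic" and token:
        try:
            decoded = base64.b64decode(token).decode("utf-8")
        except ValueError:  # malformed Base64 or invalid UTF-8
            decoded = ""
        username, _, password = decoded.partition(":")
        if username in USERS and USERS[username] == password:
            return 200, {}
    # No credentials, or wrong ones: challenge the client.
    return 401, {"WWW-Authenticate": 'Basic realm="example"'}


# The first request carries no credentials, so the server challenges the client.
first_status, challenge = handle_request({})

# The browser prompts the user once, caches the result, and resends it
# with every later request to the same server.
cached = make_authorization_header("alice", "secret")
second_status, _ = handle_request({"Authorization": cached})
third_status, _ = handle_request({"Authorization": cached})
```

Because the cached header value is resent unchanged, the server can check it on every request without keeping any per-client state, which is consistent with HTTP being stateless.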
We will see in Chapter 7 that HTTP performs a rather weak form of authentication, one that would not be difficult to break. That chapter also covers more secure and robust authentication schemes.
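One way to see the weakness, assuming the Basic scheme is in use: the username and password travel in the Authorization: header as Base64 text, which is an encoding rather than encryption. The sketch below shows that anyone who observes such a header can recover the credentials with a single decoding step. The function name and the captured value are hypothetical.

```python
import base64


def recover_credentials(authorization_value):
    """Recover (username, password) from a captured 'Basic ...' header value."""
    scheme, _, token = authorization_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization value")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Hypothetical captured header value: the Base64 encoding of "alice:secret".
captured = "Basic YWxpY2U6c2VjcmV0"
username, password = recover_credentials(captured)
```

No key or secret is needed for the decoding, which is why the scheme offers little protection unless the connection itself is secured.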
Cookies are an alternative mechanism for sites to keep track of users. They are defined in RFC 2109. Some Web sites use cookies and others don't. Let's walk through an example. Suppose a client contacts a Web site for the first time, and this site uses cookies. The server's response will include a Set-cookie: header. Often this header line contains an identification number generated by the Web server. For example, the header line might be:
Set-cookie: 1678453
When the HTTP client receives the response message, it sees the Set-cookie: header and identification number. It then appends a line to a special cookie file that is stored on the client machine. This line typically includes the host name of the server and the identification number associated with the user. In subsequent requests to the same server, say one week later, the client includes a Cookie: request header, and this header line specifies the identification number for that server. In the current example, the request message includes the header line:
Cookie: 1678453
In this manner, the server does not know the username of the user, but the server does know that this user is the same user that made a specific request one week ago.
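The cookie exchange above can be simulated with a toy server and client. This is a sketch under simplifying assumptions: the headers carry a bare identification number, as in the example, whereas real cookies carry name=value pairs with additional attributes; the cookie file is modeled as an in-memory dictionary; and the class names and host name are hypothetical.

```python
import itertools


class Server:
    """Toy server that tags each new client with an identification number."""

    def __init__(self, first_id=1678453):
        self._next_id = itertools.count(first_id)
        self.visits = {}  # identification number -> number of requests seen

    def handle(self, request_headers):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie is None or cookie not in self.visits:
            # First contact: generate an ID and hand it to the client.
            cookie = str(next(self._next_id))
            self.visits[cookie] = 0
            response_headers["Set-cookie"] = cookie
        self.visits[cookie] += 1
        return response_headers


class Client:
    """Toy client whose cookie file maps a server host name to its ID."""

    def __init__(self):
        self.cookie_file = {}

    def request(self, host, server):
        headers = {}
        if host in self.cookie_file:
            headers["Cookie"] = self.cookie_file[host]
        response = server.handle(headers)
        if "Set-cookie" in response:
            # Append the host name and ID to the cookie file.
            self.cookie_file[host] = response["Set-cookie"]
        return headers, response


server = Server()
client = Client()
# First visit: no Cookie: header is sent, and the server replies with Set-cookie:.
first_request, first_response = client.request("www.example.com", server)
# A later visit: the client sends the stored ID back in a Cookie: header.
later_request, later_response = client.request("www.example.com", server)
```

The server never learns a username here. It only sees that the second request carries the same identification number it issued in response to the first.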
Web servers use cookies for many different purposes:
Cookies do, however, pose problems for mobile users who access the same site from different machines: because each machine keeps its own cookie file, the site will treat the same user as a different user on each machine. We conclude by pointing the reader to the page Persistent Client State HTTP Cookies, which provides an in-depth but readable introduction to cookies. We also recommend Cookies Central, which includes extensive information on the cookie controversy.