By storing previously retrieved objects, Web caching can reduce object-retrieval delays and diminish the amount of Web traffic sent over the Internet. Web caches can reside in a client or in an intermediate network cache server. We will discuss network caching at the end of this chapter. In this subsection, we restrict our attention to client caching.
Although Web caching can reduce user-perceived response times, it introduces a new problem: a copy of an object residing in the cache may be stale. In other words, the object housed in the Web server may have been modified since the copy was cached at the client. Fortunately, HTTP has a mechanism that allows the client to employ caching while still ensuring that all objects passed to the browser are up-to-date. This mechanism is called the conditional GET. An HTTP request message is a so-called conditional GET message if (i) the request message uses the GET method and (ii) the request message includes an If-Modified-Since: header line.
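The two-part definition above can be expressed as a small check on a raw request message. The following is only an illustrative sketch: the function name is_conditional_get is hypothetical, and the parsing is deliberately minimal (it assumes one header per line and does not handle folded headers or malformed input).

```python
def is_conditional_get(request_text: str) -> bool:
    """Return True if the raw request is a GET that carries If-Modified-Since."""
    lines = request_text.strip().splitlines()
    if not lines:
        return False

    # The request line looks like: "GET /fruit/kiwi.webp HTTP/1.0"
    method = lines[0].split()[0]

    header_names = set()
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, separator, _value = line.partition(":")
        if separator:
            # Header field names are case-insensitive, so normalize them.
            header_names.add(name.strip().lower())

    # Condition (i): GET method. Condition (ii): If-Modified-Since header present.
    return method == "GET" and "if-modified-since" in header_names
```

Because header names are compared in lower case, the check accepts both If-Modified-Since: and If-modified-since: spellings.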
To illustrate how the conditional GET operates, let's walk through an example. First, a browser requests an uncached object from some Web server:
GET /fruit/kiwi.webp HTTP/1.0
User-agent: Mozilla/4.0
Accept: text/html, image/webp
Second, the Web server sends a response message with the object to the client:
HTTP/1.0 200 OK
Date: Wed, 12 Aug 1998 15:39:29
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 09:23:24
Content-Type: image/webp
data data data data data …
The client displays the object to the user and also saves the object in its local cache. Importantly, the client caches the last-modified date along with the object. Third, one week later, the user requests the same object and the object is still in the cache. Since this object may have been modified at the Web server in the past week, the browser performs an up-to-date check by issuing a conditional GET. Specifically, the browser sends:
GET /fruit/kiwi.webp HTTP/1.0
User-agent: Mozilla/4.0
Accept: text/html, image/webp
If-modified-since: Mon, 22 Jun 1998 09:23:24
Note that the value of the If-modified-since: header line is exactly equal to the value of the Last-Modified: header line that was sent by the server one week ago. This conditional GET tells the server to send the object only if the object has been modified since the specified date. Suppose the object has not been modified since 22 Jun 1998 09:23:24. Then, fourth, the Web server sends a response message to the client:
HTTP/1.0 304 Not Modified
Date: Wed, 19 Aug 1998 15:39:29
Server: Apache/1.3.0 (Unix)
(empty entity body)
We see that in response to the conditional GET, the Web server still sends a response message, but it does not include the requested object in the response message. Including the requested object would only waste bandwidth and increase user-perceived response time, particularly if the object is large (such as a high-resolution image). Note that this last response message has 304 Not Modified in the status line, which tells the client that it can go ahead and use its cached copy of the object.
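The four steps of the example can be mimicked with a short simulation. This is a sketch under stated assumptions, not how any real browser or server is implemented: the names server_respond, ClientCache, and objects are hypothetical, the "network" is an ordinary function call, and dates are parsed in the exact format used in the messages above (English day and month names, no time zone).

```python
from datetime import datetime

# Matches date strings such as "Mon, 22 Jun 1998 09:23:24" from the example.
DATE_FORMAT = "%a, %d %b %Y %H:%M:%S"


def parse_date(value):
    return datetime.strptime(value, DATE_FORMAT)


def server_respond(objects, path, request_headers):
    """Hypothetical server: return (status_code, response_headers, body)."""
    last_modified, body = objects[path]
    since = request_headers.get("If-Modified-Since")
    if since is not None and parse_date(last_modified) <= parse_date(since):
        # Not modified since the client's copy: send no entity body.
        return 304, {}, b""
    return 200, {"Last-Modified": last_modified}, body


class ClientCache:
    """Hypothetical client cache keyed by path."""

    def __init__(self, objects):
        self.objects = objects  # stands in for the remote Web server
        self.entries = {}       # path -> (last_modified, body)

    def fetch(self, path):
        headers = {}
        if path in self.entries:
            # Echo back the Last-Modified value saved with the cached copy.
            headers["If-Modified-Since"] = self.entries[path][0]
        status, response_headers, body = server_respond(self.objects, path, headers)
        if status == 304:
            return status, self.entries[path][1]  # reuse the cached copy
        # Fresh object: cache it together with its last-modified date.
        self.entries[path] = (response_headers["Last-Modified"], body)
        return status, body
```

With objects = {"/fruit/kiwi.webp": ("Mon, 22 Jun 1998 09:23:24", b"data")}, the first call to fetch yields status 200 and fills the cache, and a second call yields 304 together with the cached bytes. If the server's copy is later given a newer last-modified date, the next fetch yields 200 and the cache entry is replaced.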