Browsers (and other User Agents) talk to web servers using the HTTP protocol.
An HTTP request consists of a header and an optional body.
The request header begins with a single line containing an HTTP request method, followed by a URL, followed by an HTTP protocol version.
Example:
GET /stories/top/topstory.html HTTP/1.0
The GET method is used for most requests. Other commonly used methods include POST and HEAD.
After the first line of the request, several request header lines may follow, of the form
attribute-name: attribute-value
These header lines are used to transmit information about what browser version is issuing the request, as well as any cookies that have been previously sent to the browser from this web server.
After all of the header lines have been transmitted, a blank line is sent.
For some methods, such as POST, the request may include a body with additional information (such as data submitted in a form). That information is sent after the blank line.
Here is a sample HTTP request, sent from a Mozilla browser when requesting the URL http://www.cs.bju.edu/portal/index.asp:
GET /portal/index.asp HTTP/1.1
Host: www.cs.bju.edu
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: ASPSESSIONIDQARQQDQA=GGELAOOBANPOLIJMBKCGONCI; ASPSESSIONIDAASSTCTD=PBJBCOOBAPLFMNFJCKEMFFLF
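The request layout described above (request line, header lines, blank line, optional body) can be sketched in Python roughly as follows. This is a minimal illustration, not a complete HTTP parser: the function name parse_request is hypothetical, the sample is an abbreviated copy of the request shown above, and the code assumes lines end with CR LF, as HTTP requires on the wire.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    # The blank line (CR LF CR LF) separates the header section from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, URL, and protocol version separated by spaces.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Each header line has the form "attribute-name: attribute-value".
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Abbreviated version of the sample request (assumption: only three headers kept).
sample = (
    "GET /portal/index.asp HTTP/1.1\r\n"
    "Host: www.cs.bju.edu\r\n"
    "Accept-Language: en-us,en;q=0.5\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
method, target, version, headers, body = parse_request(sample)
```

A GET request like this one yields an empty body; for a POST, whatever follows the blank line is returned as the body.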
After the browser sends its request, it waits for the web server to respond.
The web server sends an HTTP response.
The first line is the status line, which consists of an HTTP version number, a numeric status code, and a textual description.
Following the response code, the server sends a series of response headers, in a format similar to request headers. Some important headers include content-type and content-length.
After the headers comes a blank line. Then, the server transmits the body of the response.
Here is a sample HTTP response, sent from an IIS server in response to the previous request for URL http://www.cs.bju.edu/portal/index.asp:
HTTP/1.x 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 17 Jan 2004 15:07:20 GMT
X-Powered-By: ASP.NET
Content-Length: 7262
Content-Type: text/html
Cache-Control: private
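As a rough sketch of how a client might take such a response apart, the Python below splits the status line, collects the headers, and uses the content-length header to decide how many bytes of body to keep. The function name parse_response is hypothetical, and the sample is a shortened stand-in for the response above with an invented 13-byte body and a matching Content-Length.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    # The blank line separates the status line and headers from the body.
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, status code, textual description.
    parts = lines[0].split(" ", 2)
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # content-length gives the size of the body in bytes; if it is absent,
    # this sketch simply keeps everything after the blank line.
    length = int(headers.get("content-length", len(rest)))
    body = rest[:length]
    return version, status, reason, headers, body


# Hypothetical shortened response (the body and its length are made up).
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Microsoft-IIS/5.0\r\n"
    b"Date: Sat, 17 Jan 2004 15:07:20 GMT\r\n"
    b"Content-Length: 13\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html></html>"
)
version, status, reason, headers, body = parse_response(sample)
```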
Common response codes include:
200 (OK): the server was able to successfully retrieve the requested resource.
404 (Not Found): the server cannot find the requested resource. This is the response a browser receives when a user clicks a “broken link”. The server usually includes an HTML message in the response describing the problem and explaining how the user might be able to find what they are looking for.
302 (Found): tells the browser that the resource temporarily resides at a different location. The server returns the new URL in the Location response header, and the browser then automatically attempts to load the new URL. The user typically sees no message indicating that the resource has been moved.
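How a client might react to these three codes can be sketched as below. The function next_step and its action names are hypothetical, chosen only for illustration. The sketch assumes headers arrive as a dictionary with lower-case names and that the new URL of a 302 response is carried in the Location header, which may be relative to the original URL.

```python
from urllib.parse import urljoin


def next_step(request_url: str, status: int, headers: dict):
    """Return an (action, url) pair describing what a client could do next."""
    if status == 200:
        # Success: the body holds the requested resource.
        return ("render", request_url)
    if status == 302:
        # Redirect: resolve the Location value against the URL just requested,
        # since the server may send a relative reference.
        return ("redirect", urljoin(request_url, headers["location"]))
    if status == 404:
        # Not found: show the error page the server sent in the body.
        return ("show_error", request_url)
    # Any other status code is outside the scope of this sketch.
    return ("unhandled", request_url)


# Hypothetical redirect target, for illustration only.
action, url = next_step(
    "http://www.cs.bju.edu/portal/index.asp",
    302,
    {"location": "/portal/home.asp"},
)
```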
A URL often refers to a document containing HTML, but it might also be an image, a sound clip, a movie, or other resource.
When the web server transmits the requested resource, it tells the browser what kind of content the response body contains using the content-type response header. This header specifies the MIME type of the content.
RFC 2046 specifies the format of MIME types, and IANA maintains a registry of registered MIME types, including text/plain, text/html, image/gif, image/jpeg, etc. The web server usually uses the file name extension (.html, .txt, .gif, etc.) to determine the MIME type. Files of type .exe, as well as files whose extension is unrecognized, are typically assigned a MIME type of application/octet-stream. When browsers receive these files, they usually present a “save file?” dialog.
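The extension-based lookup a server performs can be approximated with a small table, as in the Python sketch below. The table contents and the function name content_type_for are assumptions made for illustration; real servers ship much larger, configurable mappings, and Python's standard mimetypes module provides a fuller one.

```python
import os

# Hypothetical, deliberately tiny extension table.
MIME_BY_EXTENSION = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}

# Fallback for unrecognized extensions: generic binary data.
DEFAULT_MIME_TYPE = "application/octet-stream"


def content_type_for(path: str) -> str:
    """Choose a Content-Type value from the file name extension."""
    _, extension = os.path.splitext(path)
    # Extensions are compared case-insensitively (.HTML and .html match).
    return MIME_BY_EXTENSION.get(extension.lower(), DEFAULT_MIME_TYPE)


html_type = content_type_for("/stories/top/topstory.html")
exe_type = content_type_for("setup.exe")
```

Because .exe is not in the table, it falls through to the default type, which is what usually triggers the browser's save dialog.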
When a user submits a form,