A basic overview of the HTTP Protocol

This post gives a basic overview of the HTTP protocol, including its theory of operation. It also briefly describes HTTP persistent connections, pipelining, proxies and cookies.

HTTP (Hypertext Transfer Protocol) is the most popular protocol used for web browsing. It is an application layer networking protocol that applications use to access data on the World Wide Web (WWW).
A sample HTTP request and response
HTTP Basic Theory of Operation
  • HTTP is a standard, text-based application layer protocol, used by all browsers to access web pages stored on servers across the globe.
  • It is similar to FTP in some aspects, as it uses TCP as the underlying transport layer protocol to transfer files and supports methods like GET and PUT for data transfer. However, HTTP uses just a single TCP connection, compared to the two TCP connections used by FTP (one control and one data). HTTP is also similar to SMTP in the structure of its protocol messages.
  • HTTP expects reliable delivery: all data transferred through it must reach the peer machine without any loss. HTTP does not implement this reliability itself but relies on TCP as the transport layer protocol to provide it.
  • It is a simple client-server request-reply protocol, where clients send HTTP requests and servers respond with HTTP replies.
  • HTTP is a stateless protocol: each HTTP request and its reply are treated independently of all others, so the server does not maintain any state about earlier HTTP transactions.
  • HTTP supports multiple basic operations in the form of different HTTP methods such as GET, PUT, POST and HEAD. The functions of some of the basic HTTP methods are given in the diagram below:
Different Types of HTTP Methods
  • HTTPS is the secured version of the protocol. It sends HTTP messages over a connection encrypted with TLS, the successor to the SSL protocol.
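
To make the text-based request-reply format described above concrete, here is a minimal Python sketch that builds a GET request and parses a reply. The helper names (`build_request`, `parse_response`), the host name and the sample reply are assumptions made up for illustration; a real client would normally use an HTTP library instead.

```python
def build_request(method, path, host):
    """Return the bytes of a minimal HTTP/1.1 request with no body."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",              # the Host header is mandatory in HTTP/1.1
        "Connection: close",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP reply into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


request = build_request("GET", "/index.html", "www.example.com")

# A hand-written reply, standing in for what a server might send back.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
status, headers, body = parse_response(sample_response)
```

Both messages are plain lines of text separated by CRLF, with a blank line between the headers and the optional body.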

Methods to speed up HTTP transfers

HTTP Persistent Connections

A web page usually consists of multiple objects (text, multiple image files etc.), each fetched with its own HTTP request and reply. In a non-persistent HTTP connection, a separate TCP connection is used for transferring each object, whereas in a persistent HTTP connection, a single TCP connection is used for transferring multiple objects, one after the other. With persistent connections, the HTTP server does not immediately close the TCP connection after sending the HTTP response to the initial HTTP request for a web page.

A Persistent HTTP connection

This way, further HTTP requests and responses can reuse the same TCP connection to transfer the other objects belonging to the same web page. HTTP 1.1 uses persistent connections by default.
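
The reuse of one TCP connection can be observed with a small experiment, sketched here with Python's standard `http.server` and `http.client` modules. The handler, the paths and the `client_ports` list are assumptions made up for this demo. The server records the client's TCP source port for every request; if all requests arrive from the same port, they travelled over the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

client_ports = []  # TCP source port seen by the server for each request


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = f"object {self.path}".encode()
        self.send_response(200)
        # Content-Length tells the client where this reply ends,
        # so the connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
bodies = []
for path in ("/page.html", "/logo.png", "/style.css"):
    conn.request("GET", path)
    bodies.append(conn.getresponse().read())  # read fully before reusing
conn.close()

server.shutdown()
server.server_close()
```

After the loop, `client_ports` should hold the same port number three times, showing that a single TCP connection carried all three objects.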

HTTP Pipelining

Since each web page consists of multiple objects, the client has to send a separate HTTP request to the server for each object. Without the HTTP pipelining feature, a client issues a new request only after the previous response has been received. To expedite the transfer, HTTP pipelining was introduced, where the client sends an HTTP request as soon as it encounters a new referenced object in the web page. With pipelining, clients send new HTTP requests without waiting for the responses to previous requests, so several requests are outstanding at the same time; the server returns the responses in the same order as the requests. This makes better use of network resources and speeds up the web page transfer.
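
The benefit of pipelining can be estimated with a rough timing model. The sketch below is only a back-of-the-envelope calculation under simplifying assumptions that are not from the original post: one connection that is already open, a fixed round-trip time, a fixed transfer time per object, and no server processing delay. The function names and figures are made up for illustration.

```python
def fetch_time_sequential(num_objects, rtt_ms, transfer_ms):
    """Each request waits for the previous response: one round trip per object."""
    return num_objects * (rtt_ms + transfer_ms)


def fetch_time_pipelined(num_objects, rtt_ms, transfer_ms):
    """Requests are sent back to back, so the round trip is paid roughly once."""
    return rtt_ms + num_objects * transfer_ms


# Assumed figures: 10 objects, 100 ms round trip, 20 ms to transfer each object.
sequential_ms = fetch_time_sequential(10, rtt_ms=100, transfer_ms=20)
pipelined_ms = fetch_time_pipelined(10, rtt_ms=100, transfer_ms=20)
```

Under these assumptions the sequential fetch takes 1200 ms and the pipelined fetch 300 ms; the gap grows with the round-trip time and the number of objects.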

HTTP Proxy

HTTP proxies are intermediate machines placed closer to the HTTP clients to expedite HTTP transfers. HTTP proxies locally cache frequently accessed pages and serve multiple clients. The HTTP proxy caches a copy of the HTTP reply whenever a client machine accesses a new web page. Subsequently, if an HTTP request is sent for the same web page (either by the same client or by a different client), the HTTP proxy checks with the HTTP server whether the version of the web page it holds is the latest. If yes, the HTTP proxy sends the HTTP response back to the end client using its cached copy. Otherwise, it gets the updated page from the actual web server, updates its local cache and then replies to the client. HTTP proxies thereby serve the double purpose of conserving network bandwidth and expediting web page transfers.
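
The check-then-serve logic of a caching proxy can be sketched in a few lines of Python. This is a simplified in-memory model, not a real proxy: `OriginServer`, `CachingProxy` and the integer version numbers are assumptions made up for illustration (real HTTP uses validators carried in headers for the freshness check).

```python
class OriginServer:
    """Stand-in for the real web server: maps a path to (version, body)."""

    def __init__(self):
        self.pages = {}
        self.full_fetches = 0  # counts how often a full page was transferred

    def publish(self, path, body):
        version = self.pages.get(path, (0, b""))[0] + 1
        self.pages[path] = (version, body)

    def current_version(self, path):
        return self.pages[path][0]  # cheap check, no page body transferred

    def fetch(self, path):
        self.full_fetches += 1
        return self.pages[path]


class CachingProxy:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}  # path -> (version, body)

    def get(self, path):
        cached = self.cache.get(path)
        if cached is not None and cached[0] == self.origin.current_version(path):
            return cached[1]  # cached copy is still the latest version
        self.cache[path] = self.origin.fetch(path)  # miss or stale: refetch
        return self.cache[path][1]


origin = OriginServer()
origin.publish("/news.html", b"morning edition")
proxy = CachingProxy(origin)

first = proxy.get("/news.html")   # not cached yet: fetched from the origin
second = proxy.get("/news.html")  # validated, then served from the cache
origin.publish("/news.html", b"evening edition")
third = proxy.get("/news.html")   # cached copy is stale: fetched again
```

Three client requests cause only two full transfers from the origin server, which is where the bandwidth saving comes from.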

HTTP Cookies

An HTTP cookie is a small piece of data used by HTTP servers to track a specific user. When a new user or a new computer accesses a web page for the first time, the web server creates a cookie for that user and sends it back as part of the HTTP reply. The cookie is typically a name-value pair carrying a unique identifier, which the server can associate with information about that user (like name, address, preferences etc.). The cookie sent by the server is stored locally by the browser. Whenever the user accesses the same HTTP server again, the browser sends the cookie as an additional header in the HTTP request. The server uses this cookie to uniquely identify the user and may customize the HTTP reply (web page contents) based on that user’s preferences. Cookies can be used for multiple purposes, like tracking a user’s preferences and serving web content adaptively, giving access to web pages only to authenticated users, electronic shopping etc.
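
The cookie exchange can be illustrated with Python's standard `http.cookies` module. The cookie name `session_id`, its value and the `preferences` table are hypothetical, chosen only for this sketch.

```python
from http.cookies import SimpleCookie

# Server side, first visit: create a cookie and emit it as a reply header.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
outgoing["session_id"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Browser side, later visit: the stored cookie goes back in a Cookie header.
cookie_header_value = "session_id=abc123"

# Server side, later visit: parse the header and look the user up.
incoming = SimpleCookie()
incoming.load(cookie_header_value)
session_id = incoming["session_id"].value

preferences = {"abc123": {"language": "en"}}  # hypothetical per-user data
user_prefs = preferences.get(session_id, {})
```

Note that the cookie itself only carries an identifier here; the user's details stay on the server, keyed by that identifier.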

 

How does TFTP work?

This post gives a basic overview of the TFTP protocol, including its benefits and its theory of operation.

Trivial File Transfer Protocol (TFTP) is a simple, lightweight file transfer protocol used for transferring files over the network. It is similar to FTP but supports far fewer features and hence comes with a smaller footprint.

What TFTP provides

  • Low-overhead file transfer, as it uses UDP as the transport layer protocol
  • Smaller code size or footprint
  • ASCII and binary modes of file transfer

What TFTP does not provide

  • Authentication
  • A rich set of user interface commands

Use of TFTP

  • TFTP is mainly used during the device bootstrap process for downloading the device OS/firmware and configuration files. It is typically used for copying bootstrap and configuration files between nodes belonging to the same LAN.
  • TFTP is used in situations where all the features of a full file transfer protocol like FTP are not needed.
  • It is used along with boot protocols like BOOTP and DHCP to initialize devices. Whenever an IP enabled node boots up, it gets its IP address and other device and network related parameters through BOOTP or DHCP. As part of these parameters, the client also receives the TFTP server address, bootstrap file and configuration file details (file name and directory location). The client then uses the TFTP protocol to download the bootstrap image and configuration files from the TFTP server.

Basic Theory of Operation

  • TFTP is a client-server, application layer protocol, with TFTP clients running the TFTP client software and TFTP servers running the TFTP server software.
  • TFTP uses UDP as the underlying transport layer protocol. Since UDP is much simpler than TCP, it requires much less code space, and hence TFTP can fit even inside small boot ROMs.
  • TFTP servers wait on the well-known UDP port number 69. Since UDP is connectionless, a TFTP client that wishes to send or receive files does not establish a connection; it opens a UDP socket and sends its first request to the server’s IP address on port 69.
  • The TFTP client then sends a read request (RRQ) to the server if it wants to get a file, or a write request (WRQ) if it wants to transfer a file onto the server.
  • TFTP splits the file to be transferred into blocks of 512 bytes and transfers them as TFTP DATA messages. Each TFTP DATA block is numbered and carried inside a separate UDP message.
  • The last block of a file is always sent with a size of less than 512 bytes. When the peer receives a block with a size of less than 512 bytes, it treats that block as the last block of the file being transferred. Even if the file size happens to be an exact multiple of 512 bytes, TFTP sends a block with zero bytes as the final block, to indicate to the peer that the file transfer is over.
  • Reliability: Each block is numbered and sent inside a separate UDP message. Since TFTP uses UDP, reliable delivery of each block is not guaranteed by the underlying network protocols. So, TFTP itself takes care of reliability by requiring the peer to acknowledge each successfully received block.
  • Flow Control: TFTP sends data block by block. After sending a block, the sending end starts a block timer. If an acknowledgment for the block is received from the peer before the timer expires, the next block of the file is sent. Otherwise, the current block is resent as soon as the block timer expires, and the whole process repeats until the block is successfully acknowledged. Hence, TFTP is a stop-and-wait protocol, and flow control is achieved by the sender having at most one outstanding block at any instant of time.
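
The block splitting and stop-and-wait behaviour described above can be simulated without any networking. In the Python sketch below, `split_into_blocks`, `send_stop_and_wait` and the `ack_arrives` callback are hypothetical names; the callback stands in for "an acknowledgment arrived before the block timer expired".

```python
BLOCK_SIZE = 512


def split_into_blocks(data):
    """Return (block number, payload) pairs, numbered from 1."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if not blocks or len(blocks[-1]) == BLOCK_SIZE:
        blocks.append(b"")  # a short (here empty) block marks the end of file
    return list(enumerate(blocks, start=1))


def send_stop_and_wait(data, ack_arrives):
    """Send one block at a time, resending each until it is acknowledged.

    ack_arrives(block_number, attempt) returns True if the simulated
    acknowledgment came back in time. Returns the total number of DATA
    transmissions, retransmissions included.
    """
    transmissions = 0
    for number, _payload in split_into_blocks(data):
        attempt = 1
        while True:
            transmissions += 1  # (re)send the current block
            if ack_arrives(number, attempt):
                break           # acknowledged: move on to the next block
            attempt += 1        # timer expired: resend the same block
    return transmissions


# A 1024-byte file is an exact multiple of 512, so an empty block is added.
blocks = split_into_blocks(bytes(1024))


def lossy(number, attempt):
    """Simulated channel: only the first transmission of block 2 is lost."""
    return not (number == 2 and attempt == 1)


total_sent = send_stop_and_wait(bytes(1300), lossy)
```

The 1300-byte file needs three blocks (512, 512 and 276 bytes); with one lost transmission, the sender ends up transmitting four DATA messages.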

TFTP messages

The TFTP protocol has five types of messages, as given in the diagram below:

TFTP Message Types

  • The RRQ and WRQ messages are used by the client to request the server to start reading or writing a file respectively. Both these messages carry the file name and the transfer mode (ASCII or binary, named netascii and octet in the protocol) as additional parameters.
  • The DATA messages carry the actual file blocks, with each message carrying a block of data. Each block has a sequence number field indicating the block number.
  • The ACK message is used to acknowledge successfully received data blocks. It has the sequence number as the additional parameter, indicating the block number that was successfully received. Whenever a block is received error free (indicated by the UDP checksum), then the receiving TFTP node immediately acknowledges the block to the peer, by sending an ACK message.
  • The ERROR message is sent to the peer whenever some operation could not be performed (e.g. invalid file name, file does not have read/write permissions etc).
  • The TFTP protocol has been enhanced to allow negotiation of additional options, such as block size, timeout interval and transfer size.
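
The five message types map onto a compact binary layout: a 2-byte opcode, followed by type-specific fields. The Python sketch below encodes and decodes them following the packet formats in RFC 1350; the function names and the sample file name are assumptions made up for illustration.

```python
import struct

# Opcodes as defined in RFC 1350.
RRQ, WRQ, DATA, ACK, ERROR = 1, 2, 3, 4, 5


def pack_request(opcode, filename, mode="octet"):
    """RRQ/WRQ: opcode, file name, zero byte, transfer mode, zero byte."""
    return (struct.pack("!H", opcode)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def pack_data(block_number, payload):
    """DATA: opcode, block number, then up to 512 bytes of file data."""
    return struct.pack("!HH", DATA, block_number) + payload


def pack_ack(block_number):
    """ACK: opcode and the block number being acknowledged."""
    return struct.pack("!HH", ACK, block_number)


def pack_error(code, message):
    """ERROR: opcode, error code, error message, zero byte."""
    return struct.pack("!HH", ERROR, code) + message.encode("ascii") + b"\x00"


def unpack(packet):
    """Decode a packet into a tuple starting with its opcode."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode in (RRQ, WRQ):
        filename, mode, _ = packet[2:].split(b"\x00")
        return opcode, filename.decode("ascii"), mode.decode("ascii")
    (number,) = struct.unpack("!H", packet[2:4])
    if opcode == DATA:
        return opcode, number, packet[4:]
    if opcode == ACK:
        return opcode, number
    if opcode == ERROR:
        return opcode, number, packet[4:-1].decode("ascii")
    raise ValueError(f"unknown TFTP opcode {opcode}")


read_request = pack_request(RRQ, "boot.img")  # hypothetical file name
decoded_data = unpack(pack_data(1, b"first block"))
decoded_error = unpack(pack_error(1, "File not found"))
```

All multi-byte fields use network byte order (the `!` in the `struct` format strings), and strings are terminated by a zero byte rather than prefixed with a length.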