February 12th, 2008
I suppose this post would have been more prophetic a decade or two ago. It was in the ’90s that HTTP really became the Great protocol. It is the foundation of the World Wide Web, and it is the language with which browsers were able to open the doorway to the Internet for us. So am I a little behind the times to suggest that HTTP now has an emerging future relevance? Is HTTP a relic of the past, or does it have something to contribute to the future?
One of the distinctive features of current Internet technological advance is the growing realm of openly sharing and utilizing data from disparate sources. Facilitating this progress is one of the principal goals of my work and this site. I want the Open Web to be more than just a bunch of pages that are developed without proprietary constraints; I want it to be the environment for an open flow of information, with intelligible interconnection of data that can give participants unprecedented leverage and permutations of capabilities. "Mashup" is the buzzword that describes this process. However, in order for information to flow rapidly, there must be commonly understood communication. JSON has enormous potential because it is so simple, expressive, and pervasive that it forms an excellent syntax for expressing data. However, JSON is not a transport. Two agents that wish to dialogue may understand JSON data, but they still need a mechanism to communicate and transfer that information. HTTP is almost always the right choice for the transport. The incredible adoption of HTTP is the main reason for this. No transport is more widely understood.
What is wrong with HTTP?
Before going into the benefits of HTTP, let us look at the problems with HTTP, or more specifically, THE problem with HTTP. The most fundamental problem with HTTP is that it requires every single response to be preceded by one corresponding request. The specification describes HTTP as a request/response protocol. This constraint has an enormous impact on the capabilities of HTTP. The first major impact is in hindering performance optimizations. In order to load a web page, every resource must be requested before the server can send it. This creates a significant latency problem: there can be large gaps in transfers while servers are waiting to receive requests. If this constraint were not in place, a server could easily determine the resources that a user agent will most likely need and send them before they are requested.
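To make the latency cost concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under simplifying assumptions of my own, not a measurement: a single connection, no pipelining or parallel connections, a fixed round-trip time, and a fixed transfer time per resource. The function names and the numbers are hypothetical.

```python
def request_per_resource_ms(resources: int, round_trip_ms: int, transfer_ms: int) -> int:
    """Every resource (the page included) costs a full round trip plus its transfer."""
    return resources * (round_trip_ms + transfer_ms)


def unprompted_send_ms(resources: int, round_trip_ms: int, transfer_ms: int) -> int:
    """One round trip for the first request; the server then sends everything back to back."""
    return round_trip_ms + resources * transfer_ms


if __name__ == "__main__":
    # Hypothetical page: 10 resources, 100 ms round trip, 20 ms to transfer each one.
    print(request_per_resource_ms(10, 100, 20))  # 1200
    print(unprompted_send_ms(10, 100, 20))       # 300
```

Under these assumptions, most of the 1200 ms is spent waiting on round trips rather than moving data, which is the gap described above.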
This constraint is also the fundamental cause of consternation with Comet development. Comet consists of efforts to allow servers to send messages to clients asynchronously instead of in immediate response to a request. Doing this requires creating an outstanding HTTP request that the server can respond to when it wants to. Comet push capabilities could easily be achieved if servers could simply send messages to the client without requiring a preceding request.
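The "outstanding request" technique is often called long polling. The following is a minimal sketch of it using only the Python standard library, not a production Comet server. The `/events` path, the `outbox` queue, the handler name, and the timeouts are all assumptions made for illustration.

```python
import http.client
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical mailbox of messages the server wants to push to the client.
outbox = queue.Queue()


class LongPollHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Park the request until there is something to say, or give up after 5 s.
        try:
            status, body = 200, outbox.get(timeout=5).encode("utf-8")
        except queue.Empty:
            status, body = 204, b""  # nothing happened; the client should ask again
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def wait_for_message(host, port):
    """Client side: issue a request whose response arrives whenever the server is ready."""
    connection = http.client.HTTPConnection(host, port, timeout=10)
    try:
        connection.request("GET", "/events")
        return connection.getresponse().read().decode("utf-8")
    finally:
        connection.close()


def demo():
    # Port 0 asks the operating system for any free local port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), LongPollHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # The server-side event happens half a second after the client starts waiting.
    threading.Timer(0.5, outbox.put, args=("hello from the server",)).start()
    try:
        return wait_for_message(*server.server_address)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Notice how much machinery exists only to keep a request pending. The server still cannot speak first; it can only delay its answer.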
Throwing away the entire protocol because of a single issue is absurd; rather, let’s fix or enhance the protocol. In recent articles I have discussed how non-request/response-bound HTTP messages can be sent within HTTP messages in order to deal with this problem using existing infrastructure.
So why is HTTP the right choice for future information transport?
It is so pervasive – HTTP is everywhere. It is how all browsers communicate with web servers. HTTP is understood by an overwhelming amount of software. Attempts to reinvent the functionality and capabilities of HTTP are essentially asking for this broadly understood language to be abandoned in favor of a new one. Unless the new semantics and vocabulary bring very significant advantages, such attempts are generally either doomed to obscurity or, worse, become a cause of division into multiple semantics for the same thing, increasing code complexity and costs.
It has the constructs for tomorrow – With the ever-increasing interchange of data in the future, more sophisticated and robust techniques for communicating data are needed. Many of these techniques already exist in HTTP, but have simply not yet been needed with yesterday’s technology. The future of high-performance, intelligent data interchange will hinge on capabilities that already exist in HTTP, including:
Content negotiation – HTTP includes vocabulary for negotiating between different formats.
Partial data transfers – HTTP has an extensible mechanism for sending a range of information.
Robust error handling – HTTP includes a comprehensive set of status codes for reporting errors.
Parallel scalability – Perhaps one of the most impressive features of HTTP is how carefully it is designed so that demand can easily be scaled across numerous machines acting as HTTP proxy servers.
REST/CRUD semantics – HTTP provides semantics for basic create, read, update, and delete operations.
Performance improvements – HTTP pipelining has only begun to be utilized (only Opera has it turned on by default). Substantial performance improvements can be realized through pipelining.
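Two of the capabilities listed above, content negotiation and partial data transfers, can be sketched in a few lines of Python. This is a simplified illustration of how a server might interpret the `Accept` and `Range` request headers, not a complete implementation of the specification. The function names are hypothetical, media type parameters other than `q` are ignored, and only a single byte range is handled.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Split an Accept header into (media range, q-value) pairs; q defaults to 1.0."""
    preferences = []
    for item in header.split(","):
        media_range, *params = [piece.strip() for piece in item.split(";")]
        if not media_range:
            continue
        quality = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        preferences.append((media_range.lower(), quality))
    return preferences


def choose_format(accept_header: str, available: List[str]) -> Optional[str]:
    """Return the offered media type the client prefers most, or None if none is acceptable."""
    preferences = parse_accept(accept_header)
    best, best_quality = None, 0.0
    for offered in available:
        main_type = offered.split("/")[0]
        # The most specific matching range wins: exact type, then type/*, then */*.
        for candidate in (offered, main_type + "/*", "*/*"):
            matches = [q for media_range, q in preferences if media_range == candidate]
            if matches:
                if matches[0] > best_quality:
                    best, best_quality = offered, matches[0]
                break
    return best


def serve_range(data: bytes, range_header: str) -> Tuple[int, bytes, Optional[str]]:
    """Answer a single-range request with (status, body, Content-Range value)."""
    unit, _, spec = range_header.partition("=")
    first, _, last = spec.strip().partition("-")
    total = len(data)
    try:
        if unit.strip().lower() != "bytes" or "," in spec:
            raise ValueError("only a single byte range is handled in this sketch")
        if first == "":
            start, end = max(total - int(last), 0), total - 1  # suffix form: last N bytes
        else:
            start = int(first)
            if last and int(last) < start:
                raise ValueError("last byte position is before the first")
            end = min(int(last), total - 1) if last else total - 1
    except ValueError:
        return 200, data, None  # unusable Range header: fall back to the whole resource
    if start >= total:
        return 416, b"", "bytes */%d" % total  # range not satisfiable
    return 206, data[start:end + 1], "bytes %d-%d/%d" % (start, end, total)


if __name__ == "__main__":
    print(choose_format("application/json;q=0.9, text/html", ["application/xml", "application/json"]))
    print(serve_range(b"hello world", "bytes=0-4"))
```

The point is that both sides already share this vocabulary. Neither the client nor the server has to invent a private convention for "send me JSON if you can" or "send me only the first five bytes".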
Emerging technologies give us new leverage with HTTP – With traditional web application development, much of the workings of HTTP were hidden away by the browser and the server. However, the Ajax revolution gives developers far greater control of HTTP messaging. Most Ajax developers have simply used XMLHttpRequest as a means for communicating simple payloads of data back and forth to the server, but XHR has given developers new access to HTTP capabilities through header metadata, and leverage to utilize the full semantics of HTTP for more meaningful communication.
Furthermore, with new XHR capabilities coming soon (FF3 will have cross-site XHR support), XHR communication will involve much more than simply communicating with your own server. Communicating with your own server does not necessarily require widely understood vocabulary, but when communicating with other servers, the cost and efficiency of integration will be directly related to how much shared vocabulary can be utilized to provide jointly understood communication. New ways to utilize, extend, and leverage HTTP are being developed as well, such as the Atom Publishing Protocol.
Does it really matter what is on the wire? Can’t we simply distribute API-based communication handlers? Consider this: is it easier to set up a TV to tune into the available TV stations, or is it easier to set up a printer to work with your new operating system? TV stations have standardized on a single format for broadcasting content. Connecting a TV to a station is as simple as turning it on and choosing your station. On the other hand, printers have no standardized communication with the computers they serve. Devices like printers use API-based communication handlers (AKA drivers). With a huge number of different protocols for each printer, and different operating systems to interact with, there is an enormous number of driver permutations that must be developed, and they are very prone to incompatibilities. While the situation has improved, many have experienced the frustrating effort that can go into trying to find the right driver for an OS/printer combination.
We need to be examining the HTTP specifications and learning how to leverage the power that is available, to maximize our potential in the coming world of extensive data interchange and mashups. Next time you need to create Ajax communication to trigger CRUD operations, consider using the RESTful HTTP methods (PUT, POST, and DELETE). Do you need to deliver multiple formats of the resource? Consider using HTTP content negotiation. Do you want Comet capabilities in your application? Consider using an HTTP standards-based approach. The more we can utilize what is there, the more widely we can be understood, and the more efficiently we can utilize the infrastructure of the web. The full utilization of HTTP can provide a solid foundation for the future of data interchange.
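As an illustration of the CRUD suggestion, here is a small Python sketch of how a server might map HTTP methods onto an in-memory store. It is one plausible mapping under my own assumptions, not the only RESTful design: the `/items` URL, the `ItemResource` class, and the choice of status codes for each case are hypothetical, and listing the collection is left out.

```python
import json
from typing import Dict, Optional, Tuple

COLLECTION = "/items"  # hypothetical collection URL


class ItemResource:
    """Maps HTTP methods onto create, read, update, and delete for an in-memory store."""

    def __init__(self) -> None:
        self._items: Dict[str, object] = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: str = "{}") -> Tuple[int, Optional[str]]:
        """Return (status code, payload); the payload is JSON, a new item's URL, or None."""
        if path != COLLECTION and not path.startswith(COLLECTION + "/"):
            return 404, None
        item_id = path[len(COLLECTION):].strip("/")

        if method == "POST" and not item_id:
            # Create: the server picks the identifier and reports where the item lives.
            item_id = str(self._next_id)
            self._next_id += 1
            self._items[item_id] = json.loads(body)
            return 201, COLLECTION + "/" + item_id
        if method == "GET" and item_id:
            # Read.
            if item_id not in self._items:
                return 404, None
            return 200, json.dumps(self._items[item_id])
        if method == "PUT" and item_id:
            # Update (or create at a URL chosen by the client).
            created = item_id not in self._items
            self._items[item_id] = json.loads(body)
            return (201 if created else 200), None
        if method == "DELETE" and item_id:
            # Delete.
            if item_id not in self._items:
                return 404, None
            del self._items[item_id]
            return 204, None
        return 405, None  # method not supported on this URL


if __name__ == "__main__":
    items = ItemResource()
    print(items.handle("POST", "/items", '{"title": "HTTP"}'))
    print(items.handle("GET", "/items/1"))
```

Because the verbs and status codes are standard, any HTTP-aware client, proxy, or cache can understand what each of these exchanges means without reading my code.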