kablooie
To share, learn and express my views (mostly technical).
Methods for asynchronous communication over HTTP
This post was long overdue; it has been sitting in my drafts folder for almost a year now.
Following my post on WebSockets availability in Chrome, I looked around to see what options exist for asynchronous communication over HTTP without WebSockets. Below, I list the approaches that were available prior to WebSockets and discuss the pros and cons of each.
HTTP is a synchronous request/response application protocol in which the client always initiates a request and the server responds to it; the server never initiates a request to the client. This limits HTTP applications to a synchronous, “pull”-only interaction pattern. However, there are several approaches that emulate asynchronous communication over this synchronous channel. I briefly explain each method, with its pros and cons, below:
Polling (also called Pull Technology)
Polling requires the client to send a request to the server at regular intervals to check for updates. Server events can be queued and delivered on each client request, emulating server-initiated communication. The polling interval can be fixed according to the application's requirements, or the server can indicate the polling interval in its responses.
- Advantages
- Simplifies server implementation by not requiring the server to maintain a list of subscribers and by shifting the responsibility of guaranteed delivery to consumers.
- Stateless interaction
- Disadvantages
- True real-time communication is not possible; latency is bound to the poll interval.
- Uses up server resources unnecessarily, since many polls return nothing new
- A balance has to be struck between shorter and longer poll intervals to improve efficiency
- Increased network traffic (can be reduced by using conditional GET)
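The polling cycle above can be sketched roughly as follows. This is an in-process model, not a real HTTP client: `EventQueueServer`, `handle_poll` and `poll` are hypothetical names made up for illustration, and the status codes 200/204 merely stand in for "here are updates" and "nothing new".

```python
import time
from collections import deque


class EventQueueServer:
    """Hypothetical stand-in for an HTTP endpoint that queues events."""

    def __init__(self):
        self._pending = deque()

    def publish(self, event):
        # Server-side events wait here until the client next asks.
        self._pending.append(event)

    def handle_poll(self):
        """Mimic a response: 200 with queued events, or 204 when empty."""
        if not self._pending:
            return 204, []
        events = list(self._pending)
        self._pending.clear()
        return 200, events


def poll(server, interval_s, rounds):
    """Ask for updates `rounds` times, sleeping `interval_s` between requests."""
    received = []
    empty_polls = 0
    for _ in range(rounds):
        status, events = server.handle_poll()
        if status == 200:
            received.extend(events)
        else:
            empty_polls += 1  # a wasted round trip: nothing had changed
        time.sleep(interval_s)
    return received, empty_polls


server = EventQueueServer()
server.publish("price=10")
server.publish("price=11")
received, empty_polls = poll(server, interval_s=0.01, rounds=3)
print(received, empty_polls)
```

The `empty_polls` counter makes the main drawback visible: every poll after the first one is a request that carried no information.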
Long Polling
This is a variation of the polling approach in which the client sends a request and the server responds only when it has information to send. If the server determines that the requested resource has not changed, it holds on to the request until the resource changes or the specified timeout expires. When the server sends the response, whether because of an update or a timeout, the client issues another request and the whole cycle repeats, i.e. the server always has a pending request from the client.
- Advantages
- Better real time communication compared to normal polling because the information is sent as soon as it is available.
- Disadvantages
- Open HTTP connections consume system resources.
- In a thread-per-request server, each request requires a thread that blocks until data is available or the timeout expires. With many concurrent requests, this results in a large number of threads blocked waiting for data.
- Server side support for asynchronous request processing (Asynchronous I/O) is required for scalability.
- No standard mechanism available. Each environment has its own mechanism to handle long polling.
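A minimal sketch of the "hold the request until something changes or the timeout expires" behaviour, using only a condition variable in place of a real HTTP server. `LongPollChannel`, `update` and `wait_for_change` are hypothetical names; a real server would call something like `wait_for_change` inside its request handler.

```python
import threading


class LongPollChannel:
    """Hypothetical model of a resource that clients long-poll for changes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0
        self._value = None

    def update(self, value):
        """Change the resource and wake every request that is being held."""
        with self._cond:
            self._value = value
            self._version += 1
            self._cond.notify_all()

    def wait_for_change(self, known_version, timeout_s):
        """Block until the version differs from `known_version`.

        Returns (version, value) on a change, or None if the timeout expires.
        """
        with self._cond:
            changed = self._cond.wait_for(
                lambda: self._version != known_version, timeout=timeout_s
            )
            if not changed:
                return None
            return self._version, self._value


channel = LongPollChannel()

# Simulate the resource changing shortly after the client starts waiting.
threading.Timer(0.05, channel.update, args=("hello",)).start()

first = channel.wait_for_change(known_version=0, timeout_s=2.0)
# The client immediately re-issues the request; nothing changes this time.
second = channel.wait_for_change(known_version=first[0], timeout_s=0.05)
print(first, second)
```

Note that the calling thread is blocked for the whole wait, which is exactly the thread-per-request cost listed under the disadvantages.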
Streaming
HTTP streaming is similar to long polling except that the connection is never closed even after the data is pushed to the client by the server. In this approach, a client sends a request to the server and the server responds to the request by sending data back to the client. However, after the data is sent, the server does not close the connection but instead keeps the connection open and pushes data to the client whenever available over the same connection.
- Advantages
- Allows multiple data transmissions over a single connection, resulting in fewer requests to the server.
- No per-message connection overhead.
- Real-time updates are possible.
- Disadvantages
- No standard mechanism available. Each environment has its own mechanism to handle streaming.
- Might not work through proxies/firewalls
- Might result in a large number of open connections when there are many concurrent requests.
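One mechanism that can carry such a stream is HTTP/1.1 chunked transfer encoding. That choice is an assumption for illustration, since no specific mechanism is named above. The sketch below only shows the framing of several pushes inside one response body; `encode_chunk` and `decode_chunks` are hypothetical helper names, and a real client would parse the bytes incrementally as they arrive rather than from a complete buffer.

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame one push as a chunk: hex length, CRLF, payload, CRLF."""
    return b"%x\r\n" % len(payload) + payload + b"\r\n"


def decode_chunks(body: bytes):
    """Yield each payload from a chunked body until the zero-length chunk."""
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)
        if size == 0:
            return  # terminating chunk: the server has ended the stream
        start = line_end + 2
        yield body[start:start + size]
        pos = start + size + 2  # skip the payload and its trailing CRLF


# Two pushes sent over the same response, followed by the terminator.
events = [b"tick 1", b"tick 2"]
body = b"".join(encode_chunk(e) for e in events) + b"0\r\n\r\n"
decoded = list(decode_chunks(body))
print(decoded)
```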
WebHook
WebHook provides an HTTP callback via HTTP POST when an event occurs. A Web application implementing WebHook will POST a message to a specified URI as an event notification. Clients can also register custom URIs with the Web application, which then POSTs messages to these URIs to indicate state changes. The notification URI should point to an HTTP server.
- Advantages
- Provides near real time updates.
- Standardization attempts via the WebHook and PubSubHubbub initiatives
- Polling is eliminated.
- Fully asynchronous
- Disadvantages
- Might not work through proxies/firewalls
- Requires the receiver to host an HTTP server.
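The register-then-notify flow can be sketched as below. Everything here is hypothetical (`WebhookRegistry`, the event name, the `client.example` URI and the JSON payload shape); the transport is injected so the example runs without a network, with `http_post` shown as one possible real transport built on the standard library.

```python
import json
import urllib.request


def http_post(uri, body):
    """One possible real transport: POST the body and return the status code.

    Note that urlopen raises an exception for 4xx/5xx responses.
    """
    request = urllib.request.Request(
        uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


class WebhookRegistry:
    """Hypothetical sender side: remembers callback URIs per event name."""

    def __init__(self, post=http_post):
        self._post = post
        self._subscribers = {}

    def register(self, event, callback_uri):
        self._subscribers.setdefault(event, []).append(callback_uri)

    def notify(self, event, data):
        """POST the event to every registered URI; return status per URI."""
        body = json.dumps({"event": event, "data": data}).encode("utf-8")
        return {uri: self._post(uri, body) for uri in self._subscribers.get(event, [])}


# Demo with a fake transport that records what would have been sent.
sent = []


def fake_post(uri, body):
    sent.append((uri, body))
    return 200


registry = WebhookRegistry(post=fake_post)
registry.register("build.finished", "http://client.example/hooks/build")
results = registry.notify("build.finished", {"status": "ok"})
print(results)
```

The receiver's half is simply an HTTP server listening at the registered URI, which is the hosting requirement listed under the disadvantages.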
Custom Media types in REST
Though most of REST was easy to grok, the HATEOAS constraint and the use of media types in REST posed a steep learning curve for me. So when deciding whether to use a custom media type, stick with application/xml or use a standard media type, I had a hard time choosing. What I realized after many months of wading through articles, posting questions in discussion forums and trying to write clients against different representations is that though there are standard media types out there (text/html, Atom etc.), there are cases where standard media types alone won’t meet an application’s requirements. In these scenarios, defining a custom media type to capture and convey a consistent vocabulary for the application domain is not only good but, IMO, a necessity if we want to get all the benefits of REST.
For our RESTful application, we started off defining a domain-specific XML vocabulary but ended up with a generic XML vocabulary to describe resources. One of the major goals of the vocabulary definition was the ability to describe all resources in a uniform and consistent manner. Initially we designed the vocabulary around the semantics of each resource and soon found that it was harder to maintain. We then decided to make a generic “resource” vocabulary, i.e. one meant to be applicable to as large a set of resources as possible. We also based our design on the needs of the client rather than on the specific resource. The design assumes that anything exposed by our application is a resource that can be linked to and that provides links to other relevant resources. We also tried to follow the principle of “applying the software engineering principle of generality to the component interface”.
As per REST, a Resource is defined as:
“The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. “today’s weather in Los Angeles”), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author’s hypertext reference must fit within the definition of a resource”
Thus a resource is the fundamental building block in a RESTful system. A resource can have properties of any type (xsd: or custom) and can also contain other resources. Based on these assumptions, we came up with a generic XML vocabulary to describe all the resources in our system. I will post the vocabulary in a later post.
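To give a feel for what "generic" means here, below is a purely illustrative sketch. It is not the vocabulary we actually defined: the element names (`resource`, `property`, `link`), the media type string and the `describe_resource` helper are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom media type name, for illustration only.
MEDIA_TYPE = "application/vnd.example.resource+xml"


def describe_resource(uri, properties, links):
    """Render any resource with the same three element names.

    `properties` maps property names to values; `links` maps a link
    relation to the URI of a related resource.
    """
    root = ET.Element("resource", {"href": uri})
    for name, value in properties.items():
        prop = ET.SubElement(root, "property", {"name": name})
        prop.text = str(value)
    for rel, href in links.items():
        ET.SubElement(root, "link", {"rel": rel, "href": href})
    return ET.tostring(root, encoding="unicode")


# The same function describes an order or a customer; only the data differs.
order_xml = describe_resource(
    "/orders/42", {"status": "shipped"}, {"customer": "/customers/7"}
)
customer_xml = describe_resource("/customers/7", {"name": "Asha"}, {})
print(order_xml)
```

A client written against such a format needs to understand only these few elements, not a separate schema per resource type.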
Science & Technology 10K Run 2010 @ IISc
I ran my first 10K at the SnT 10K event organized at the Indian Institute of Science. Though I didn’t come in the first 3 or even the first 20, I was able to finish the entire 10K without stopping even to take a sip of water. I think I could have gone another 5K without much effort. I ran the entire 10K following the pose running method. Though I didn’t feel the many benefits claimed for the method, I can confirm that it is very light on the knees and that your calves get a real workout and feel stronger. After the run I felt rejuvenated, and I am looking forward to the Sunfeast World 10K Bangalore on 23rd May 2010.
Multi-Language online IDE
An online IDE (ideone.com) that supports compiling and running programs written in various languages showcases the idea of SaaS, where compilers and runtime environments are provided as a service. I think this is an extremely useful service and will be helpful for trying out snippets of code without the need to find and install compilers and runtimes for your particular environment. This, however, doesn’t do away with the need for a local development environment, since I do not think we can use this service to develop full software applications. The service already supports many popular programming languages such as C++, C, Java, Python, C# etc., with syntax highlighting. So if you find sample code in a particular language and wish to compile it and see the output, you just submit the code to the service and you see the results in a jiffy. Of course, you cannot run code that needs a graphics environment. The service supports only programs that are command-line oriented or headless, but it is nevertheless a useful service, I think.
Guess + estimating = guesstimating
Preparing for a Marathon
What happens when you delete a wave?
So what happens when you delete a wave in Google Wave? I tried it and found that it is almost impossible to do, even if you are the creator of the wave. When you have created a wave and added participants to it, deleting just moves the wave to your trash bin and it stays alive there!! i.e. it still gets updates when other participants of the wave interact with it. I haven’t found a way to remove the wave from the trash so far. Does this mean that once a wave is created, it stays in the system forever?? I checked how long Google keeps my information and found the following in Google’s privacy policy for Wave. This is what they have to say:
A Wave is queued for deletion from our system once all participants who have access to the Wave remove themselves from the Wave. Residual copies of deleted Waves may take up to 60 days to be purged from our servers and may remain in our offline backup.
Even though there is only one copy of a wave in the system, and even though you are the creator of the wave, it is not easy to delete it. You have to remove all participants, but there is no way to do that except to ask all participants to remove themselves (which is currently not implemented). So currently the only way is to ask all participants to delete the wave and wait up to 60 days for it to be removed from the system, and even then there is no guarantee that it will be removed from Google’s backup store!
WebSockets now available in Chrome
The Google Chrome developer channel release 4.0.429.0 now supports HTML5 WebSockets. This is great news for people who want real-time asynchronous communication on the web and were until now relying on polling or other non-standard mechanisms. WebSockets is a new bi-directional communication channel that is part of the HTML5 specification. It allows full-duplex communication over a single TCP/IP connection. The WebSockets API is exposed via JavaScript and is built into any HTML5-compliant Web browser. I hope more browsers start supporting this standard so we can finally say goodbye to non-standard mechanisms such as long polling, COMET, BOSH etc.
Straight from the horse’s mouth
Tom DeMarco, the author of the popular book Controlling Software Projects: Management, Measurement and Estimation, reflects on software engineering nearly 3 decades later and now has a totally different view on the subject. The article “Software Engineering: An Idea whose time has come and gone?” is an interesting take on the subject based on practical experience. A must-read for all those involved in software development. You can find the article here.
More good news
The future of SaaS is getting brighter by the day. The following three news items are very encouraging and I hope they will provide enough momentum to diminish the gap between the desktop and the browser. The days of Linux, Windows and Solaris as the main platforms are going to be history, and the new platforms will be the browser and the desktop.
#TARGET=desktop
TARGET=browser
./configure --target=$TARGET ….
This is exactly what we need: good competition that brings out the best for the community. I hope more players join in.
