In the web world a web browser makes a new request for every static file it has to retrieve, so; a stylesheet, javascript file, inline image - all initiate a new server request. Whilst my knowledge of the web is pretty good, the underlying technologies like websockets are somewhat new to me in how they work and what they are capable of.
My question is rather theoretical, but I am wondering if it's possible now or would ever be possible to serve static files via a websocket? Considering websockets are a persistent connection from the client (web browser) to the server, it makes sense that websockets could be used for serving some if not all static content as it would just be one connection as opposed to many.
To clarify a little bit.
I realise my wording about connections was incorrect as pointed out by Greg below. But from what I understand the reason CDN's were created and are still used today is to address the issue with browsers and or servers having a hard limit on the number of concurrent downloads, once you hit that limit your requests are then queued thus adding to the page load time. I am aware they were also created to provide cookie-less requests as well. So really my question should be: "Can websockets be used in place of a CDN?"
BrowserScope has some useful metrics, it appears as though the request limit is about 6 per hostname for most modern browsers and even IE8. But as I said sometimes people have more than 6 resources, does this mean they're being queued and slowing the page load time where websockets could potentially reduce this to one?
It's definitely possible but there are a few reasons why you probably don't want to use this for static resources:
You need at least one resource that is statically delivered over the standard HTTP mechanism which means you need something capable of serving static resources anyways. Generally you want to keep Javascript separate from your HTML which would mean another static load. Or you can be messy and put the WebSocket code embedded on the main page, but you still are really any better off yet.
You can't open WebSocket connections until a script on the page starts running. Establishing the WebSocket connection adds some initial latency.
Most browsers will load non-conflicting static resources in parallel (some older browsers have a severe limit on the number of parallel connections, but they still have some parallelization). You could open multiple WebSocket connections for different static resources, but doing this reliably and efficiently is going to take a lot of effort. Browsers have already solved most of these issues for static resources.
Each WebSocket connection is a guaranteed order message based transport. Combined with the serialized nature of Javascript execution this effectively means you get to process one WebSocket message at a time. You could use Web Workers to be able to process more than one WebSocket connection in parallel, but the main render script will still be serialized across those connections. You could certainly make this efficient, but once again, this isn't a trivial problem and browsers have already solved a lot of these static resource loading problems.
Many web servers support gziping resources before delivering them. WebSocket does not yet have compression support (it's being discussed as an extension in the working group). This means if you want to compress your resources over WebSocket you will have to do this in Javascript which will add more latency.
If you have parts of your page that are dynamically updated using static resources (e.g. loading in new images into a HTML5 canvas game), then WebSockets may be your best option because an already established WebSocket connection will have low latency and overhead for getting pushed updates from the server then getting these delivered over HTTP. But I wouldn't recommend using WebSockets for the initial static resources when you page first loads.
This answer does not really address your web sockets question, but it may make it obsolete:
The next generation technology that is supposed to solve the problem of transferring several assets over a single connection is SPDY, which is was a candidate for HTTP 2.0. It has working implementations in Chrome and Firefox and already some experimental server-side support by the likes of Google and Twitter.
Edit: SPDY protocol is now deprecated. You can however look into it for research purposes.
Related
I am designing an architecture for a web application using Node.js, and we need to be able to send medium size files to the client from a gallery. As a user browses the gallery, they will be sent these binary files as fast as possible(for each gallery item). The files could go up to 6Mb, but probably average around 2Mb.
My client is insisting that we should use websockets for data transfer instead of XHR. Just to be clear, we don't need bi-directional communication.
I lack the experience in this domain and need help in my reasoning.
My points so far are the following:
Using WebSockets breaks any client-side caching that would be provided by HTTP. Users would be forced to re-download content if they visited the same item in the gallery twice.
WebSocket messages cannot be handled by/routed to proxy caches. They must always be handled by an explicit server.
CDNs are built to provide extensive web caching, intercepting HTTP requests. WebSockets would limit us from leveraging CDNs.
I guess that Nodejs would be able to respond faster to hundreds/thousands of XHR than concurrent websocket connections.
Are there any technical arguments for/against using websockets for pure data transfer over standard HTTPRequests. Can anyone nullify/clarify my points and maybe provide links to help with my research?
I found this link very helpful: https://www.mnot.net/cache_docs/#PROXY
Off the top of my head, I can see the following technical arguments for XHR besides that it uses HTTP and is therefore better at caching (which is essential for speed):
HTTP is the dedicated protocol for file downloads. It's natively built into browsers (with the XHR interface), therefore better optimised and easier to use for the developer
HTTP already features a lot of the things you'd need to hand-craft with websockets, like file path requests, auth, sessions, caching… All both on the client and server side.
XHR has better support even in older browsers
some firewalls only allow HTTP(S) connections
There do not seem to be any technical reasons to prefer web sockets - the only thing that might affect your choice is "the client is king". You might be able to convince him though by telling him how much he has to pay you to reimplement the HTTP features on a websocket connection. It ain't cheap, especially when your application gets more complex.
Btw, I wouldn't support your last point. Node should be able to deal with just as many websocket connections as HTTP connections; if properly optimised all things are even. However, if your server architecture is not based solely on node, there is a multitude of plain file serving applications that are probably faster than node (not even counting the HTTP caching layer).
Of course I am aware of Ajax, but the problem with Ajax is that the browser should poll the server frequently to find whether there is new data. This increases server load.
Is there any better method (even using Ajax) other than polling the server frequently?
Yes, what you're looking for is COMET http://en.wikipedia.org/wiki/Comet_(programming). Other good Google terms to search for are AJAX-push and reverse-ajax.
Yes, it's called Reverse Ajax or Comet. Comet is basically an umbrella term for different ways of opening long-lived HTTP requests in order to push data in real-time to a web browser. I'd recommend StreamHub Push Server, they have some cool demos and it's much easier to get started with than any of the other servers. Check out the Getting Started with Comet and StreamHub Tutorial for a quick intro. You can use the Community Edition which is available to download for free but is limited to 20 concurrent users. The commercial version is well worth it for the support alone plus you get SSL and Desktop .NET & Java client adapters. Help is available via the Google Group, there's a good bunch of tutorials on the net and there's a GWT Comet adapter too.
Nowadays you should use WebSockets.
This is 2011 standard that allows to initiate connections with HTTP and then upgrade them to two-directional client-server message-based communication.
You can easily initiate the connection from javascript:
var ws = new WebSocket("ws://your.domain.com/somePathIfYouNeed?args=any");
ws.onmessage = function (evt)
{
var message = evt.data;
//decode message (with JSON or something) and do the needed
};
The sever-side handling depend on your tenchnology stack.
Look into Comet (a spoof on the fact that Ajax is a cleaning agent and so is Comet) which is basically "reverse Ajax." Be aware that this requires a long-lived server connection for each user to receive notifications so be aware of the performance implications when writing your app.
http://en.wikipedia.org/wiki/Comet_(programming)
Comet is definitely what you want. Depending on your language/framework requirements, there are different server libraries available. For example, WebSync is an IIS-integrated comet server for ASP.NET/C#/IIS developers, and there are a bunch of other standalone servers as well if you need tighter integration with other languages.
I would strongly suggest to invest some time on Comet, but I dont know an actual implementation or library you could use.
For an sort of "callcenter control panel" of a web app that involved updating agent and call-queue status for a live Callcenter we developed an in-house solution that works, but is far away from a library you could use.
What we did was to implement a small service on the server that talks to the phone-system, waits for new events and maintains a photograph of the situation. This service provides a small webserver.
Our web-clients connects over HTTP to this webserver and ask for the last photo (coded in XML), displays it and then goes again, asking for the new photo. The webserver at this point can:
Return the new photo, if there is one
Block the client for some seconds (30 in our setup) waiting for some event to ocurr and change the photograph. If no event was generated at that point, it returns the same photo, only to allow the connection to stay alive and not timeout the client.
This way, when clients polls, it get a response in 0 to 30 seconds max. If a new event was already generated it gets it immediately), otherwise it blocks until new event is generated.
It's basically polling, but it somewhat smart polling to not overheat the webserver. If Comet is not your answer, I'm sure this could be implemented using the same idea but using more extensively AJAX or coding in JSON for better results. This was designed pre-AJAX era, so there are lots of room for improvement.
If someone can provide a actual lightweight implementation of this, great!
An interesting alternative to Comet is to use sockets in Flash.
Yet another, standard, way is SSE (Server-Sent Events, also known as EventSource, after the JavaScript object).
Comet was actually coined by Alex Russell from Dojo Toolkit ( http://www.dojotoolkit.org ). Here is a link to more infomration http://cometdproject.dojotoolkit.org/
There are other methods. Not sure if they are "better" in your situation. You could have a Java applet that connects to the server on page load and waits for stuff to be sent by the server. It would be a quite a bit slower on start-up, but would allow the browser to receive data from the server on an infrequent basis, without polling.
You can use a Flash/Flex application on the client with BlazeDS or LiveCycle on the server side. Data can be pushed to the client using an RTMP connection. Be aware that RTMP uses a non standard port. But you can easily fall back to polling if the port is blocked.
It's possible to achive what you're aiming at through the use of persistent http connections.
Check out the Comet article over at wikipedia, that's a good place to start.
You're not providing much info but if you're looking at building some kind of event-driven site (a'la digg spy) or something along the lines of that you'll probably be looking at implementing a hidden IFRAME that connects to a url where the connection never closes and then you'll push script-tags from the server to the client in order to perform the updates.
Might be worth checking out Meteor Server which is a web server designed for COMET. Nice demo and it also is used by twitterfall.
Once a connection is opened to the server it can be kept open and the server can Push content a long while ago I did with using multipart/x-mixed-replace but this didn't work in IE.
I think you can do clever stuff with polling that makes it work more like push by not sending content unchanged headers but leaving the connection open but I've never done this.
You could try out our Comet Component - though it's extremely experimental...!
please check this library https://github.com/SignalR/SignalR to know how to push data to clients dynamically as it becomes available
You can also look into Java Pushlets if you are using jsp pages.
Might want to look at ReverseHTTP also.
I currently have a javascript library that is using a JSON file to print them on the screen in an interactive way. (::We are using D3JS Library)
When we are on a client, we can easily delete, edit and create some nodes, that are updated in the JSON every 5-10 seconds.
The problem comes from two main facts :
First the automatic function that call itself every x seconds could make data corruption if we are doing some stuff on the datas already represented on the screen.
Then the project has been made in order to permit 5 people to interact together. When they are present onto the same session we cannot decently make them refresh every 5 seconds, that cause many overhead and doesn't avoid data corruption.
We have mainly thought about a solution only made with javascript and some AJAX but we realize that it should be reconsidered with a trigger that inform the client that the datas are no longer OK.
We are thinking currently about opening a script onto a server in order to attribute on each client an ID.
The goal would be to detect the modification done on the JSON file (on the server). But the point where we are stuck is :
1) Is there a best scripting language to interact server/web?
2) Which type of things should we use to make the clients update their datas? (socket right?)
About the second point the easiest way would be to call a JS function be we aren't aware of the possibilities given by the shell codes...
Sorry about the fact that we are happy developpers but maybe not enough skilled to solve this problem.
Thanks for your helps !
You can achieve that using pure javascript with the new WebSocket feature.
http://www.html5rocks.com/en/tutorials/websockets/basics/
Edit:
WebSocket is a web technology providing full-duplex communications channels over a single TCP connection. The WebSocket API is being standardized by the W3C, and the WebSocket protocol has been standardized by the IETF as RFC 6455.
WebSocket is designed to be implemented in web browsers and web servers, but it can be used by any client or server application. The WebSocket Protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.[1] The WebSocket protocol makes possible more interaction between a browser and a web site, facilitating live content and the creation of real-time games. This is made possible by providing a standardized way for the server to send content to the browser without being solicited by the client, and allowing for messages to be passed back and forth while keeping the connection open.
How does a site programmed using TCP (that is, someone on the site is connected to the server and exchanging information via TCP) scales compared to just serving information via AJAX? Say the information exchanged is the same.
Trying to clarify: I'm asking specifially about scale: I've read that keeping thousands of TCP connections is resources (which?) demanding, as compared to just serving information statically. I want to know if this correct.
WebSockets is a technology that allows the server to push notifications to the client. AJAX on the other hand is a pull technology meaning that the client is sending requests to the server.
So for example if you had an application which needed to receive notifications from the server at regular intervals and update its UI, WebSocket is more adapted and much better. With AJAX you will have to hammer your server with requests at regular intervals to see whether some state changed on the server. With WebSockets, it's the server that will notify the client for some event happening on the server. And this will happen in a single request.
So I guess it would really depend on the type of application you are developing but WebSockets and AJAX are two completely different technologies solving different kind of problems. Which one to choose would depend on your scenario.
Websockets are not a one-for-one with AJAX; they offer substantially different features. Websockets offers the ability to 'push' data to the client. AJAX works by 'pushing' data and returning a response.
The purpose of WebSockets is to provide a low-latency, bi-directional, full-duplex and long-running connection between a browser and server. WebSockets opens up possibilities with browser applications that were previously unavailable using HTTP or AJAX.
However, there is certainly an overlap in purpose between WebSockets and AJAX. For example, when the browser wants to be notified of server events (i.e. push) either AJAX or WebSockets are both viable options. If your application needs low-latency push events then this would be a factor in favor of WebSockets which would definitely scale better in this scenario. On the other hand, if you need to work with existing frameworks and deployed technologies (OAuth, RESTful API's, proxies, etc.) then AJAX is preferable.
If you don't need the specific benefits that WebSockets provides, then it's probably a better idea to stick with existing techniques like AJAX because this allows you to re-use and integrate with an existing ecosystem of tools, technologies, security mechanisms, knowledge bases that have been developed over the last 7 years.
But overall, Websockets will outperform AJAX by a significant factor.
I don't think there's any difference when it comes to scalability between WebSockets and standards TCP connnections. WebSocket is an upgrade from a static one way pipe into a duplex one. The physical resources are the exact same.
The main advantage of WebSockets is that they run over port 80, so it avoids most firewall problems, but you have to first connect over standard HTTP.
Here's a good page that clearly shows the benefits of the WebSocket API compared to Ajax long polling (especially on a large scale): http://www.websocket.org/quantum.html
It basically comes down to the fact that once the initial HTTP handshake is established, data can go back and forth much more quickly because the header overhead is greatly reduced (this is what most people refer to as bidirectional communication).
As an off note, if you only need to be able to push data from the server on a regular basis, but you don't need to make many client-initiated requests, then using HTML5 server-sent events with occasional Ajax requests from the client might be just what you need and much easier to implement then the WebSocket API.
I am looking at facebook news feed/ticker right now and I am wondering what technology/architecture it uses to pull in data asynchronously when any of my connections make an update. One possibility that I can think of is a javascript setInterval on a function that aggressively polls the server for new data.
I wonder how efficient that is.
Another possible technology that I can think of is something like Comet/NodeJS architecture that pings the client when there is an update on the server. I am not too familiar with this technology.
If I wanted to create something similar to this. What should I be looking into? Is the first approach the preferred way to do this? What technologies are available out there that will allow me to do this?
There are several technologies to achieve this:
polling: the app makes a request every x milliseconds to check for updates
long polling: the app makes a request to the server, but the server only responds when it has new data available (usually if no new data is available in X seconds, an empty response is sent or the connection is killed)
forever frame: a hidden iframe is opened in the page and the request is made for a doc that relies on HTTP 1.1 chunked encoding
XHR streaming: allows successive messages to be sent from the server without requiring a new HTTP request after each response
WebSockets: this is the best option, it keeps the connection alive at all time
Flash WebSockets: if WS are not natively supported by the browser, then you can include a Flash script to enhance that functionality
Usually people use Flash WebSockets or long-polling when WebSockets (the most efficient transport) is not available in the browser.
A perfect example on how to combine many transport techniques and abstract them away is Socket.IO.
Additional resources:
http://en.wikipedia.org/wiki/Push_technology
http://en.wikipedia.org/wiki/Comet_(programming))
http://www.leggetter.co.uk/2011/08/25/what-came-before-websockets.html
Server polling with JavaScript
Is there a difference between long-polling and using Comet
http://techoctave.com/c7/posts/60-simple-long-polling-example-with-javascript-and-jquery
Video discussing different techniques: http://vimeo.com/27771528
The book Even Faster Websites has a full chapter (ch. 8) dedicated to 'Scaling with Comet'.
I could be wrong, but I think that Facebook relies on a "long polling" technique that keeps an http connection open to a server for a fixed amount of time. The data sent from the server triggers an event client side that is acted upon at that time. I would imagine that they use this technique to support the older browsers that do not have websocket support built in.
I, personally, have been working on an application with similar requirements and have opted to use a combination of node.js and socket.io. The socket.io module uses a variety of polling solutions and automatically chooses the best one based on what is available on the client.
Maybe you may have a look to Goliath (non-blocking IO server written in Ruby) : http://postrank-labs.github.com/goliath/