Why do we need web-sockets? - javascript

This is more of a n00b question, but I've never really known the answer.
So why do we need the WebSocket protocol?
And what are the advantages over comet-style/long-poll/hanging-GET-style use of HTTP?

While Comet and Ajax can both deliver end-user experiences with desktop-like functionality and low user-perceived latency, only WebSockets lives up to the promise of providing a native means to accurately and efficiently stream events to and from the browser with negligible latency.
Polling makes unnecessary requests, so many connections are opened and closed needlessly in low-message-rate situations (the client sends HTTP requests at regular intervals and each one immediately receives a response, whether or not new data is available).
WebSockets remove that overhead and dramatically reduce complexity.

1- WebSocket is naturally a full-duplex, bidirectional, single-socket connection. With WebSocket, your HTTP request becomes a single request to open a WebSocket connection, and that same connection is reused from the client to the server and from the server to the client.
2- WebSocket reduces latency. Unlike polling, WebSocket makes a single request; the server does not need to wait for a request from the client before it can send data, and the client can send messages to the server at any time. This greatly reduces latency compared to polling, which sends a request at intervals regardless of whether messages are available.
3- WebSocket makes real-time communication much more efficient. You can always use polling (and sometimes even streaming) over HTTP to receive notifications, but WebSocket saves bandwidth, CPU power, and latency. WebSocket is an innovation in performance.
4- WebSocket is an underlying network protocol that enables you to build other standard protocols on top of it.
5- WebSocket is part of an effort to provide advanced capabilities to HTML5 applications in order to compete with other platforms.
6- WebSocket is about simplicity.
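To make the single-connection idea concrete, here is a minimal browser-side sketch; the wss://example.com/updates URL and the message shapes are made up for illustration, not taken from any particular application:

    // One upgrade request, then messages flow both ways on the same socket.
    const ws = new WebSocket("wss://example.com/updates");

    ws.addEventListener("open", () => {
      // The client can send at any time once the socket is open.
      ws.send(JSON.stringify({ type: "subscribe", channel: "news" }));
    });

    ws.addEventListener("message", (event) => {
      // The server can push at any time; no polling request is needed.
      console.log("server pushed:", event.data);
    });

    ws.addEventListener("close", () => console.log("connection closed"));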

Here's an article about the benefits of WebSockets over polling at websocket.org

It isn't clear that we do need them. In the scenario of pushing events to the client, a page can make ordinary AJAX GET requests in a loop, and the server can "hang" until events are available. After some timeout passes the server can return a "no events" response so the client will reconnect. During the period where the connection is open and the client is waiting for a response, there is an effective push channel from the server back to the client.
The timeout period can be adjusted to reduce the amount of unnecessary reconnecting, though it cannot typically be infinite because most server frameworks will terminate the server-side process if it appears to have hung for too long.
Given this existing capability, the question is: does a new communication framework really add significant value over what can already be done? It wouldn't really enable something that cannot be done. It would only slightly improve it.
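For reference, the hanging-GET pattern described above looks roughly like this on the client; the /events endpoint and the response shape are assumptions for illustration only:

    // Hanging GET / long-poll loop: each request stays open until the server
    // has events or times out, then the client immediately reconnects.
    async function pollForever() {
      while (true) {
        try {
          const res = await fetch("/events");
          const body = await res.json();
          (body.events || []).forEach((e) => console.log("got event:", e));
          // A "no events" (timeout) response just falls through and we reconnect.
        } catch (err) {
          // Network error: back off briefly before retrying.
          await new Promise((resolve) => setTimeout(resolve, 1000));
        }
      }
    }

    pollForever();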

Related

Slow third-party APIs clogging Express server

I am creating a question answering application using Node.js + Express for my back-end. Front-end sends the question data to the back-end, which in turn makes requests to multiple third-party APIs to get the answer data.
Problem is, some of those third-party APIs take too long to respond, since they have to do some intense processing and calculations. For that reason, I have already implemented a caching system that saves the answer data for each different question. Nevertheless, that first request each time might take up to 5 minutes.
Since my back-end server waits and does not respond back to the front-end until data arrives (the connections are being kept open), it can only serve 6 requests concurrently (that's what I have found). This is unacceptable in terms of performance.
What would be a workaround to this problem? Is there a way to not "clog" the server, so it can serve more than 6 users?
Is there a design pattern, in which the servers gives an initial response, and then serves the full data?
Perhaps, something that sets the request to "sleep" and opens up space for new connections?
Your server can serve many thousands of simultaneous requests if things are coded properly and it's not CPU intensive, just waiting for network responses. This is something that node.js is particularly good at.
A single browser, however, will only send a few requests at a time (it varies by browser) to the same endpoint, queuing the others until the earlier ones finish. So, my guess is that you're trying to test this from a single browser. That's not going to test what you really want to test, because the browser itself is limiting the number of simultaneous requests. node.js is particularly good at having lots of requests in flight at the same time. It can easily do thousands.
But, if you really have an operation that takes up to 5 minutes, that probably won't even work for an http request from a browser because the browser will probably time out an inactive connection still waiting for a result.
I can think of a couple possible solutions:
First, you could make the first http request just start the process and have it return immediately with an ID. Then, the client can check every 30 seconds or so after that, sending the ID in an http request, and your server can respond whether it has the result yet or not for that ID. This would be a client-polling solution.
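A rough sketch of that first approach with Express; the route names, the in-memory jobs map, and fetchAnswer() are hypothetical placeholders, not part of the original question:

    const express = require("express");
    const crypto = require("crypto");
    const app = express();

    const jobs = new Map(); // jobId -> { done, result }

    // Kick off the slow work and return an ID immediately.
    app.post("/api/questions", express.json(), (req, res) => {
      const jobId = crypto.randomUUID();
      jobs.set(jobId, { done: false, result: null });

      fetchAnswer(req.body.question) // hypothetical call to the slow third-party APIs
        .then((result) => jobs.set(jobId, { done: true, result }))
        .catch((err) => jobs.set(jobId, { done: true, result: { error: err.message } }));

      res.status(202).json({ jobId }); // respond right away; nothing is blocked
    });

    // The client polls this every 30 seconds or so with the ID it was given.
    app.get("/api/answers/:jobId", (req, res) => {
      const job = jobs.get(req.params.jobId);
      if (!job) return res.status(404).end();
      if (!job.done) return res.json({ ready: false });
      res.json({ ready: true, answer: job.result });
    });

    app.listen(3000);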
Second, you could establish a webSocket or socket.io connection from client to server. Then, send a message over that socket to start the request. Then, whenever the server finishes its work, it can just send the result directly to the client over the webSocket or socket.io connection. After receiving the response, the client can either keep the webSocket/socket.io connection open for use again in the future or it can close it.
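And a sketch of the second approach with socket.io; the event names and fetchAnswer() are made up, and it assumes the socket.io package on the server and socket.io-client in the browser:

    // --- Server side (Node.js + socket.io) ---
    const { Server } = require("socket.io");
    const io = new Server(3001, { cors: { origin: "*" } });

    io.on("connection", (socket) => {
      socket.on("ask", async (question) => {
        const answer = await fetchAnswer(question); // hypothetical slow call
        // Push the result back whenever it is ready; no HTTP request is left hanging.
        socket.emit("answer", answer);
      });
    });

    // --- Client side (browser, separate file) ---
    // const socket = io("http://localhost:3001");
    // socket.emit("ask", "my question text");
    // socket.on("answer", (answer) => console.log(answer));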

How are WebSockets implemented?

How are WebSockets implemented?
What is the algorithm behind this new tech (in comparison to Long-Polling)?
How can they be better than Long-Polling in terms of performance?
I am asking these questions because here we have sample code of a Jetty websocket implementation (server-side).
If we wait long enough, a timeout will occur, resulting in the following message on the client.
And that is definitely the problem I'm facing when using long polling. It stops the process to prevent server overload, doesn't it?
How are WebSockets implemented?
webSockets are implemented as follows:
Client makes HTTP request to server with "upgrade" header on the request
If server agrees to the upgrade, then client and server exchange some security credentials and the protocol on the existing TCP socket is switched from HTTP to webSocket.
There is now a lasting open TCP socket connecting client and server.
Either side can send data on this open socket at any time.
All data must be sent in a very specific webSocket packet format.
Because the socket is kept open as long as both sides agree, this gives the server a channel to "push" information to the client whenever there is something new to send. This is generally much more efficient than using client-driven Ajax calls where the client has to regularly poll for new information. And, if the client needs to send lots of messages to the server (perhaps something like a multi-player game), then using an already open socket to send a quick message to the server is also more efficient than an Ajax call.
Because of the way webSockets are initiated (starting with an HTTP request and then repurposing that socket), they are 100% compatible with existing web infrastructure and can even run on the same port as your existing web requests (e.g. port 80 or 443). This makes cross-origin security simpler and keeps anyone on either client or server side infrastructure from having to modify any infrastructure to support webSocket connections.
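As an illustration (not from the original answer), here is a minimal echo server using the ws npm package for Node.js, which performs the HTTP upgrade handshake for you:

    const { WebSocketServer } = require("ws");

    // ws handles the HTTP "Upgrade" handshake internally, then hands you a
    // long-lived socket per client.
    const wss = new WebSocketServer({ port: 8080 });

    wss.on("connection", (socket) => {
      // The server can push at any time over this open socket.
      socket.send("welcome");

      socket.on("message", (data) => {
        // Echo whatever the client sends, framed per the webSocket protocol.
        socket.send(`echo: ${data}`);
      });
    });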
What is the algorithm behind this new tech (in comparison to Long-Polling)?
There's a very good summary of how the webSocket connection algorithm and webSocket data format works here in this article: Writing WebSocket Servers.
How can they be better than Long-Polling in terms of performance?
By its very nature, long-polling is a bit of a hack. It was invented because there was no better alternative for server-initiated data sent to the client. Here are the steps:
The client makes an http request for new data from the server.
If the server has some new data, it returns that data immediately and then the client makes another http request asking for more data. If the server doesn't have new data, then it just hangs onto the connection for a while without providing a response, leaving the request pending (the socket is open, the client is waiting for a response).
If, at any time while the request is still pending, the server gets some data, then it forms that data into a response and returns it for the pending request.
If no data comes in for a while, then eventually the request will time out. At that point, the client will realize that no new data was returned and it will start a new request.
Rinse, lather, repeat. Each piece of data returned or each timeout of a pending request is then followed by another ajax request from the client.
So, while a webSocket uses one long-lived socket over which either client or server can send data to the other, the long-polling consists of the client asking the server "do you have any more data for me?" over and over and over, each with a new http request.
Long polling works when done right; it's just not as efficient in terms of server infrastructure, bandwidth usage, mobile battery life, etc...
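For comparison, a long-poll endpoint might look roughly like this in Express; the event queue, the /poll route, and the 25-second timeout are assumptions for the sketch:

    const express = require("express");
    const app = express();

    const waiting = []; // pending long-poll responses
    const events = [];  // events not yet delivered

    // Answer immediately if there is data, otherwise hold the request open
    // until data arrives or we time out.
    app.get("/poll", (req, res) => {
      if (events.length > 0) {
        return res.json({ events: events.splice(0) });
      }
      waiting.push(res);
      setTimeout(() => {
        const i = waiting.indexOf(res);
        if (i !== -1) {
          waiting.splice(i, 1);
          res.json({ events: [] }); // timeout: the client will reconnect
        }
      }, 25000);
    });

    // Something elsewhere in the app produces an event.
    function publish(event) {
      events.push(event);
      const payload = { events: events.splice(0) };
      while (waiting.length > 0) {
        waiting.pop().json(payload);
      }
    }

    app.listen(3000);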
What I want is an explanation of this: isn't the fact that WebSockets keep an open connection between client and server pretty much the same as the long-polling wait process? In other words, why don't WebSockets overload the server?
Maintaining an open webSocket connection between client and server is a very inexpensive thing for the server to do (it's just a TCP socket). An inactive, but open TCP socket takes no server CPU and only a very small amount of memory to keep track of the socket. Properly configured servers can hold hundreds of thousands of open sockets at a time.
On the other hand a client doing long-polling, even one for which there is no new information to be sent to it, will have to regularly re-establish its connection. Each time it re-establishes a new connection, there's a TCP socket teardown and new connection and then an incoming HTTP request to handle.
Here are some useful references on the topic of scaling:
600k concurrent websocket connections on AWS using Node.js
Node.js w/1M concurrent connections!
HTML5 WebSocket: A Quantum Leap in Scalability for the Web
Do HTML WebSockets maintain an open connection for each client? Does this scale?
A very good explanation of WebSockets, long polling, and other approaches:
In what situations would AJAX long/short polling be preferred over HTML5 WebSockets?
Long poll - request → wait → response. The client creates a connection to the server like AJAX does, but keeps the connection open for some time (not long, though); while the connection is open the client can receive data from the server. The client has to reconnect periodically after the connection is closed due to a timeout or end of data. On the server side it is still treated like an HTTP request, the same as AJAX, except the answer to the request will happen now or at some time in the future defined by the application logic. Supported in all major browsers.
WebSockets - client ↔ server. The client creates a TCP connection to the server and keeps it open as long as needed; either the server or the client can easily close it. The client goes through an HTTP-compatible handshake process, and if it succeeds, the server and client can exchange data in both directions at any time. This is very efficient if the application requires frequent data exchange in both directions. WebSockets use data framing that includes masking for each message sent from client to server, so the data is obfuscated (masking is not encryption; use wss:// over TLS for confidentiality). support chart (very good)
Overall, sockets have much better performance than long polling and you should use them instead of long polling.

What faster alternatives do I have to websockets for a real-time web application?

I'm planning to write a real time co-op multiplayer game. Currently I'm in the research phase. I've already written a turn-based game which used websockets and it was working fine.
I haven't tried writing a real-time game using this technology, however. My question is about websockets. Is there an alternative way to handle communication between (browser) clients? My idea is to have the game state in each client and only send the deltas to the clients, using the server as a mediator/synchronization tool.
My main concern is network speed. I want clients to be able to receive each other's actions as fast as possible so my game can stay real time. I have about 20-30 frames per second with less than a KByte of data per frame (which means a maximum of 20-30 KBytes of data per second per client).
I know that things like "network speed" depend on the connection but I'm interested in the "if all else equals" case.
From a standard browser, a webSocket is going to be your best bet. The only two alternatives are webSocket and Ajax. Both are TCP under the covers, so once the connection is established they offer pretty much the same transport. But a webSocket is a persistent connection, so you save the connection overhead every time you want to send something. Plus, the persistent connection of the webSocket allows you to send directly from server to client, which you will want.
In a typical gaming design, the underlying gaming engine needs to adapt to the speed of the transport between the server and any given client. If the client has a slower connection, then you have to drop the number of packets you send to something that can keep up (perhaps fewer frame updates in your case). The connection speed is what it is so you have to make your app deliver the best experience you can at the speed that there is.
Some other things you can do to optimize the use of the transport:
Collect all the data you need to send at one time and send it in one larger send operation rather than lots of small sends. In a webSocket, don't send three separate messages, each with its own data. Instead, create one message that contains the info from all three messages (see the sketch after this list).
Whenever possible, don't rely on the latency of the connection by sending, waiting for a response, sending again, waiting for response, etc... Instead, try to parallelize operations so you send, send, send and then process responses as they come in.
Tweak settings for outgoing packets from your server so there is no Nagle delay waiting to see if there is other data to go in the same packet. See Nagle's Algorithm. I don't think you have the ability to tweak this from the client in a browser.
Make sure your data is encoded as efficiently as possible for smallest packet size.
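A small sketch of the batching idea from the first bullet; the endpoint URL, flush interval, and message shape are arbitrary:

    // Collect small updates and send them as one webSocket message per tick
    // instead of many tiny sends.
    const socket = new WebSocket("wss://game.example.com/state"); // hypothetical endpoint
    const outbox = [];

    function queueUpdate(update) {
      outbox.push(update);
    }

    // Flush roughly at the game's frame rate (e.g. ~30 times per second).
    setInterval(() => {
      if (outbox.length > 0 && socket.readyState === WebSocket.OPEN) {
        socket.send(JSON.stringify({ updates: outbox.splice(0) }));
      }
    }, 33);

    // Usage: queue deltas as they happen; they go out together.
    queueUpdate({ entity: "player1", dx: 2, dy: -1 });
    queueUpdate({ entity: "player1", action: "fire" });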

How often should I send HTTP requests to the server to check for updates? (Ajax)

I'm developing a small app using Ajax and http requests.
Currently I'm sending one request per second to the server to check whether there are updates and, if there are, to fetch and download the data.
This timing profile is modified when a user interacts with the app, but that's negligible.
I'm just wondering: I could send an infinite loop of requests to the server; the more frequent the requests, the more responsive the app would be. But then doesn't the server get too many requests?
So what is the right interval between one request and the next?
I've read something about tokens, but I can't work out the best way to check whether the server has updates. Thanks in advance.
Long polling is one of the main options here. You should look into the server and see that there is good support for persistent HTTP connections if you expect to have a large number of users persistently connected.
For example, the Apache web server on its own opens a thread per connection, which can be a notable challenge when many users hold persistent connections: you end up with lots of threads (there are approaches in Apache to address this, which you can research further if need be). Jetty, a Java-based web server (among my personal favorites), uses a more advanced network library that maps many connections onto far fewer threads, essentially putting connections to sleep until traffic in or out is detected.
Here are a few links:
http://en.wikipedia.org/wiki/Push_technology
http://en.wikipedia.org/wiki/Comet_(programming)
http://www.einfobuzz.com/2011/09/reverse-ajax-comet-server-push.html
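If you do stick with plain interval polling for now, an adaptive interval with backoff is a common middle ground; the /updates endpoint and the timing bounds below are arbitrary:

    // Poll quickly while updates keep arriving, back off toward a ceiling
    // while nothing changes.
    const MIN = 1000;   // 1 second
    const MAX = 30000;  // 30 seconds
    let delay = MIN;

    async function poll() {
      try {
        const res = await fetch("/updates"); // hypothetical endpoint
        const body = await res.json();
        if (body.updates && body.updates.length > 0) {
          delay = MIN; // activity: poll quickly again
          body.updates.forEach((u) => console.log("update:", u));
        } else {
          delay = Math.min(delay * 2, MAX); // quiet: back off
        }
      } catch (err) {
        delay = Math.min(delay * 2, MAX); // error: back off too
      }
      setTimeout(poll, delay);
    }

    poll();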

How does WebSockets scale compared to standard HTTP?

How does a site programmed using TCP (that is, someone on the site is connected to the server and exchanging information via TCP) scale compared to just serving information via AJAX? Say the information exchanged is the same.
To clarify: I'm asking specifically about scale. I've read that keeping thousands of TCP connections open is demanding on resources (which ones?) compared to just serving information statically, and I want to know whether that is correct.
WebSockets is a technology that allows the server to push notifications to the client. AJAX on the other hand is a pull technology meaning that the client is sending requests to the server.
So, for example, if you have an application which needs to receive notifications from the server and update its UI, WebSocket is a much better fit. With AJAX you would have to hammer your server with requests at regular intervals to see whether some state changed on the server. With WebSockets, the server notifies the client of events happening on the server, all over the single connection that was opened up front.
So I guess it would really depend on the type of application you are developing but WebSockets and AJAX are two completely different technologies solving different kind of problems. Which one to choose would depend on your scenario.
Websockets are not a one-for-one replacement for AJAX; they offer substantially different features. Websockets offer the ability to 'push' data to the client. With AJAX, the client requests data and the server returns a response.
The purpose of WebSockets is to provide a low-latency, bi-directional, full-duplex and long-running connection between a browser and server. WebSockets opens up possibilities with browser applications that were previously unavailable using HTTP or AJAX.
However, there is certainly an overlap in purpose between WebSockets and AJAX. For example, when the browser wants to be notified of server events (i.e. push), AJAX and WebSockets are both viable options. If your application needs low-latency push events, then that is a factor in favor of WebSockets, which would definitely scale better in this scenario. On the other hand, if you need to work with existing frameworks and deployed technologies (OAuth, RESTful APIs, proxies, etc.), then AJAX is preferable.
If you don't need the specific benefits that WebSockets provides, then it's probably a better idea to stick with existing techniques like AJAX because this allows you to re-use and integrate with an existing ecosystem of tools, technologies, security mechanisms, knowledge bases that have been developed over the last 7 years.
But overall, Websockets will outperform AJAX by a significant factor.
I don't think there's any difference when it comes to scalability between WebSockets and standard TCP connections. A WebSocket is an upgrade from a static one-way pipe into a duplex one. The physical resources are exactly the same.
The main advantage of WebSockets is that they can run over the standard web ports (80 or 443), so they avoid most firewall problems; the connection starts out as a standard HTTP request and is then upgraded.
Here's a good page that clearly shows the benefits of the WebSocket API compared to Ajax long polling (especially on a large scale): http://www.websocket.org/quantum.html
It basically comes down to the fact that once the initial HTTP handshake is established, data can go back and forth much more quickly because the header overhead is greatly reduced (this is what most people refer to as bidirectional communication).
As a side note, if you only need to push data from the server on a regular basis, and you don't need to make many client-initiated requests, then using HTML5 server-sent events with occasional Ajax requests from the client might be just what you need, and it is much easier to implement than the WebSocket API.
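A minimal server-sent events sketch for that case; the Express /events endpoint and the payloads are made up for illustration:

    // --- Server (Express): one long-lived response streaming events down. ---
    const express = require("express");
    const app = express();

    app.get("/events", (req, res) => {
      res.set({
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
      });
      const timer = setInterval(() => {
        // Each SSE message is "data: ...\n\n".
        res.write(`data: ${JSON.stringify({ time: Date.now() })}\n\n`);
      }, 5000);
      req.on("close", () => clearInterval(timer));
    });

    app.listen(3000);

    // --- Client (browser): EventSource reconnects automatically. ---
    // const source = new EventSource("/events");
    // source.onmessage = (e) => console.log("pushed:", JSON.parse(e.data));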
