How to keep an HTTP connection alive? - javascript

Is there a way to keep an HTTP connection alive with JavaScript?

In short, I think the concept of long-lived HTTP connections in JavaScript really revolves around a style of communication called COMET. This can be achieved in several different ways, but essentially involves the client (using XMLHttpRequest) requesting data from the server immediately, and the server withholding the response until some event triggers it. Upon receipt of this response, the client immediately makes another request (which will once again hang at the server end until something needs sending). This simulates server push, but is effectively nothing more than a delayed response used in a clever way. In the worst case there can be fairly high latency (e.g. if two messages need sending, the cycle must be repeated twice, with all the costs involved), but generally, if the messaging rate is low, this gives a reasonable impression of real-time push.
Implementing the server side for this kind of communication is far from trivial, and requires dealing with a good deal of asynchronous communication, concurrency issues and the like. It's quite easy to write an implementation that can support a few hundred users each on their own thread, but scaling to thousands requires a much more considered approach.
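To make the client half of this pattern concrete, here is a minimal long-polling loop. It is only a sketch: the /events endpoint, the JSON response shape, and the one-second back-off are assumptions for illustration, not part of any particular framework.

```javascript
// Minimal COMET-style long-polling loop (sketch).
// The "/events" endpoint and response format are hypothetical:
// the server is expected to hold the request open until it has data.
async function longPoll(handleEvent) {
  while (true) {
    try {
      const response = await fetch('/events');   // hangs until the server has something to send
      if (response.ok) {
        handleEvent(await response.json());      // process the pushed data
      }
    } catch (err) {
      // network error or timeout: back off briefly before reconnecting
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
    // the loop immediately re-issues the request, simulating server push
  }
}

longPoll(data => console.log('server event:', data));
```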

I note that the last answer was given in 2009. Oh, how I remember the days. But lots of good things have happened since then; so I'll add this just to let people know what to look for. HTTP 1.0 provided a "keep-alive" value for the Connection request header, which meant that the connection should be maintained for further requests. In HTTP 1.1, this became the default. You actually have to opt out of it if you don't want to reuse the connection (and if you want to be nice about it).
The new standard, WebSockets, actually gives you a full-duplex persistent connection. WebSockets are supported in all up-to-date versions of popular browsers, and you can even use them in older MSIE if you install Google Chrome Frame (which means Google software is actually doing the work). Microsoft says IE supports them natively as of version 10, but I haven't tried it myself. What you need then is something to connect to, like http://highlevellogic.blogspot.se/2011/09/websocket-server-demonstration_26.html
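For reference, the browser side of a WebSocket connection is only a few lines; the URL below is a placeholder for whatever server you end up connecting to.

```javascript
// Minimal browser WebSocket client (sketch); the URL is a placeholder.
const socket = new WebSocket('ws://example.com/echo');

socket.onopen = () => socket.send('hello');                    // full-duplex: the client can send at any time
socket.onmessage = event => console.log('server said:', event.data);
socket.onclose = event => console.log('closed with code:', event.code);
socket.onerror = err => console.error('socket error:', err);
```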

Related

Best way to determine if new data is available

Many high-load sites notify their users about new messages/topics at runtime, without reloading the page. How do they do that? Which approach do they use?
I assume there are two approaches:
"Asking" the server with JavaScript at a regular interval (polling)
Using WebSockets
The common opinion is that the first one is too heavy for the server, since it produces too many requests.
About the second one's behaviour in high-load apps I know nothing - does it hold up?
So, which design approach should be used to implement features like "new message available" properly, without the need to reload the page?
The question is mostly about performance :)
WebSocket performance in the browser is not an issue, and on the server side there are performant implementations. As an example, Crossbar.io can easily handle 180k concurrent connections on a small server (tested in a VM on an older i5 notebook), and 10k/s messages - and both scale with the hardware (RAM and CPU respectively). Also: Something like Crossbar.io/Autobahn/WAMP gives you a protocol on top of WebSockets to handle the distribution of notifications to clients, making your life easier.
Full disclosure: I work for the company that works on Crossbar.io, and there are other WebSocket and PubSub solutions out there. Take a look at what best fits your use case and go with that.
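Crossbar/WAMP gives you a proper protocol on top, but to make the basic "push notifications to connected clients" idea concrete without tying it to any product, here is a bare-bones broadcast sketch using the Node ws package. The port, the message shape, and the 5-second example trigger are arbitrary placeholders.

```javascript
// Bare-bones notification broadcast over raw WebSockets (sketch, Node "ws" package).
// Port, message shape, and the example trigger interval are arbitrary choices.
const { WebSocketServer, WebSocket } = require('ws');

const wss = new WebSocketServer({ port: 8080 });

function broadcast(notification) {
  const payload = JSON.stringify(notification);
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(payload);                        // push to every connected browser
    }
  }
}

wss.on('connection', ws => {
  ws.send(JSON.stringify({ type: 'welcome' }));    // greet new subscribers
});

// Example trigger: something server-side produced a new message/topic.
setInterval(() => broadcast({ type: 'new_message', at: Date.now() }), 5000);
```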

Does AJAX or Socket.IO make more sense in my situation?

I have been working on something using AJAX that sometimes requires a couple of POSTs per second. Each POST returns a JSON object (generated by the PHP file being posted to, ~11,000 bytes) and on average the latency is between 30ms and 250ms depending on whether I'm on WiFi or wired, but roughly every 1 in 15 calls it spikes up to about 4000ms. I am trying to find a way around this; as of right now I see two options:
Throw a timeout on the AJAX call, and have it call a GET on fail (the POST should still go through, it's the return trip that always times out) or...
Cut the entire thing down, learn node.js so I can use websockets to potentially rectify this issue.
Either solution, as far as I can see, hinges on WHY the original call is failing. If it is something wrong with the AJAX call, then a new GET should be likely to go through and solve the issue. But if it is something with the server itself, then logically the GET would just time out as well; it's an issue with the server, and I'm dead in the water.
Since I have no experience yet at all with websockets I was hoping for some feedback on the best action to take next. Thanks for the input.
If it would help, I can very likely reduce the returning payload to 1/15 the size with some sneaky coding. Would that make an impact?
WebSockets are a great option! Actually working with SocketIO is pretty simple and has a shallow learning curve.
Because the connection stays open, your requests skip the DNS lookup and connection setup, giving lower latency. This means much lower overhead for each POST request you make.
If you ever foresee pushing data to your users, WebSockets are the de facto way to do it. AJAX polling is going out of style.
That said, you would have to port your back-end logic to JavaScript. You would have to change your deployment strategy to a server that supports Node apps. You would have to deal with learning a new environment, which can also have some overhead.
Before you explore Node, consider the drawbacks above. I think it is a great bit of technology, but I would also look into the following approaches, especially if you are pressed for time.
A 1/15 size reduction is totally worth it. Actually, it is worth it in both cases.
Can you do any sort of batching with your POST requests from the client side? If subsequent requests rely on the results of the previous POST request, you cannot do this. In this case, I strongly suggest using WebSockets.
All in all, there are always tradeoffs. If you aren't pressed for time, give Node and Socket.IO a whirl; they are becoming very prevalent web technologies and are worth learning.
Cut the entire thing down, learn node.js so I can use websockets to potentially rectify this issue.
There is no point in doing things with the wrong tools. If you need real-time communication, use servers which support it out of the box, like node.js (probably the simplest to get into coming from PHP).
Since I have no experience yet at all with websockets
Get some framework atop the raw WebSockets, like Primus or Socket.IO, and good luck ;)
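For a feel of how little code the Socket.IO route involves, here is a rough sketch under some assumptions: the 'request-data'/'data' event names and the getPayload() helper are made-up placeholders, not part of Socket.IO itself. First the Node server:

```javascript
// Socket.IO server sketch: push the payload over an open connection.
// Event names and getPayload() are hypothetical placeholders.
const { Server } = require('socket.io');
const io = new Server(3000);

io.on('connection', socket => {
  socket.on('request-data', async params => {
    const payload = await getPayload(params);   // stands in for whatever the PHP endpoint computed
    socket.emit('data', payload);               // no fresh connection setup per message
  });
});
```

And the matching browser side:

```javascript
// Browser side, using the socket.io client library.
const socket = io('http://localhost:3000');
socket.emit('request-data', { id: 42 });
socket.on('data', payload => render(payload));  // render() stands in for your existing UI code
```

The point of the comparison with the current AJAX setup is that, once the connection is established, each message avoids a new handshake and full HTTP request/response cycle.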

How to achieve realtime updates on my website (with Flask)?

I am using Flask and I want to show the user how many visits he has on his website, in real time.
Currently, I think one way is to create an infinite loop with some delay after every iteration, which makes an AJAX request to get the current number of visits.
I have also heard about node.js, however I think that running another process might make the computer it's running on slower (I'm assuming)?
How can I achieve the realtime updates on my site? Is there a way to do this with Flask?
Thank you in advance!
Well, there are many possibilities:
1) Polling - this is exactly what you've described. An infinite loop which makes an AJAX request every now and then (a minimal client-side sketch of this appears after this answer). Easy to implement and can easily be done with Flask, however it is quite inefficient - it eats lots of resources and scales horribly, making it a really bad choice (avoid it at all costs). You will either kill your machine with it, or the notification period (polling interval) will have to be so big that it makes for a horrible user experience.
2) Long polling - a technique where a client makes an AJAX request but the server responds to that request only when a notification is available. After receiving the notification, the client immediately makes a new request. You will require a custom web server for this - I doubt it can be done with Flask. A lot better than polling (many real websites use it), but it could be more efficient. That's why we now have:
3) WebSockets - truly bidirectional communication. Each client maintains an open TCP connection with the server and can react to incoming data. But again: it requires a custom server, plus only the most modern browsers support it.
4) Other stuff like Flash or Silverlight or other HTTP tricks (chunked encoding): pretty much the same as no. 3, though more difficult to maintain.
So as you can see if you want something more elegant (and efficient) than polling it requires some serious preparation.
As for processes: you should not worry about that. It's not about how many processes you use but how heavy they are. One badly written process can easily kill your machine, while 100 well-written ones will work smoothly. So make sure it is written in such a way that it won't freeze your machine (and I assure you that it can be done, up to some point defined by the number of simultaneous users).
As for language: it doesn't matter whether this is Node.js or any other language (like Python). Pick the one you feel better with. However, I am aware that there's a strong tendency to use Node.js for such projects, and thus there might be more suitable libraries out there. Or maybe not. Python has, for example, Twisted and/or Tornado specifically for that (and probably much, much more).
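As promised above, here is what option 1 (plain polling) looks like on the client. It is a sketch only: the /visits endpoint, the {"count": ...} response shape, and the 5-second interval are assumptions standing in for a small Flask view returning JSON.

```javascript
// Option 1 (plain polling), client side. "/visits" and its JSON shape are
// placeholders for a small Flask route returning {"count": <number>}.
const POLL_INTERVAL_MS = 5000;

async function refreshVisitCount() {
  try {
    const res = await fetch('/visits');
    const { count } = await res.json();
    document.getElementById('visit-count').textContent = count;
  } catch (err) {
    console.error('poll failed:', err);   // ignore and retry on the next tick
  }
}

setInterval(refreshVisitCount, POLL_INTERVAL_MS);
refreshVisitCount();   // also update immediately on page load
```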
WebSocket is an event-driven protocol, which means you can actually use it for truly real-time communication.
Kenneth Reitz wrote an extension named Flask-Sockets that is excellent for websockets:
Article: introducing-flask-sockets
Github: flask-sockets
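The browser side of a Flask-Sockets setup is plain WebSocket code. In this sketch, the /updates path and the message shape are assumptions; match them to whatever @sockets.route() you define on the Flask side.

```javascript
// Browser-side counterpart to a Flask-Sockets route (sketch).
// The "/updates" path and message shape are assumptions.
const ws = new WebSocket('ws://' + window.location.host + '/updates');

ws.onmessage = event => {
  const { visits } = JSON.parse(event.data);        // assumed message shape
  document.getElementById('visit-count').textContent = visits;
};

ws.onclose = () => console.log('connection closed; consider reconnecting');
```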
In my opinion, the best option for achieving real-time data streaming to a frontend UI is to use a messaging service like PubNub. They have libraries for any language you are going to want to use. Basically, your user interfaces subscribe to a data channel. Things which create data then publish to that channel, and all subscribers receive the published message very quickly. It is also extremely simple to implement.
You can use PubNub and specifically PubNub presence meant especially for online presence detection. It provides key features like
Track online and offline status of users and devices in realtime
Occupancy to monitor user and machine presence in realtime
Join/Leave Notification for immediate updates of all client connections
Global Scale with synchronized servers across the PubNub Data Stream Network
PubNub Presence Tutorial provides code that can be downloaded to get presence up and running in a few minutes. Five Ways You Can Use PubNub Presence shows you the different ways you can use PubNub presence.
You can get started by signing up for PubNub and getting your API keys.
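For a rough idea of what the subscribe side looks like with PubNub's JavaScript SDK (v4-style API), here is a hedged sketch; the keys, channel name, client uuid and message shape are placeholders.

```javascript
// PubNub subscribe + presence sketch (v4-style JavaScript SDK).
// Keys, channel name, uuid and message shape are placeholders.
const pubnub = new PubNub({
  publishKey: 'pub-c-xxxxxxxx',
  subscribeKey: 'sub-c-xxxxxxxx',
  uuid: 'visitor-123',
});

pubnub.addListener({
  message: event => console.log('update:', event.message),
  presence: event => console.log('presence:', event.action, 'occupancy:', event.occupancy),
});

pubnub.subscribe({ channels: ['site-visits'], withPresence: true });
```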

Making CPU-bound JavaScript feel responsive - web workers?

I am writing a CPU intensive javascript application. I am running into a problem where sometimes the UI is locked while CPU-intensive calculation occurs. I know that the standard approach to solving this is to call setTimeout and let the event loop respond to UI events. However, that doesn't work for me and here's why.
When the page loads, the JavaScript VM needs to do a bunch of parsing and analyzing of chunks of data. This is truly background stuff, and I am calling setTimeout to run each chunk. However, this means that the user gets a very choppy UI experience until all chunks have been completed (which can take up to 10 seconds for large files), and this happens on every save. This is not acceptable.
I can think of two solutions, neither of which I really like:
Be more granular about the chunks, thus providing more opportunities for the event loop to run. But I don't like this because the CPU-bound code is already quite complex (though it typically runs well), and calling setTimeout throughout it would make it far more complicated.
Do more work on the server. However, I am running a node server and this would simply push the problem from the client to the server, with the added problem of increased bandwidth.
Fixing this would be trivial on a traditional thread-based VM. What should I do for Javascript?
UPDATE:
Some points that I forgot to mention:
We are not concerned with legacy browsers and all users will be required to use a modern Firefox, Chrome, Opera, Safari, IE, etc.
Our initial prototype has the client and server co-located, but there should be nothing preventing us from moving to a remote server.
The data lives on the client (well...obviously, if the client and server are the same machine, but this will be the case even when we move to remote servers).
Web Workers might be the solution, but they do still seem flaky. Does anyone have experience with them? Are they stable? Which modern browsers do not support them well? Are there any general problems with them?
Depending on whether this application will ever become public or not, you have to decide whether you can use Web Workers, split the data up more or do server-side processing. For real-world applications the real solution would be doing heavy computation on the server since you can't expect the user to have the latest processor, it might be a mere netbook which will probably only cough a few times and then crash.
Web workers would be a solution when you can be sure that users have the latest browsers that support it, however if that's not the case, there's no way to shim it like most HTML5 stuff.
Based on what I know about your application, I'd say that you should send precomputed data to the client. Furthermore, Node.js is bad at doing hardcore computations so you might want to look into different data processing options on the server. Also, I don't think bandwidth will be a problem since you have to give the client the initial data anyway. How much bigger is the processed data?
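To give a sense of how the Web Worker split looks in practice, here is a minimal sketch: the main thread hands a chunk to the worker and stays responsive while the worker does the heavy parsing. parseChunk(), renderResult(), rawFileChunk and the parser-worker.js filename are placeholders for your actual code.

```javascript
// main.js - keep the UI thread free; the heavy work happens in the worker.
const worker = new Worker('parser-worker.js');

worker.onmessage = event => {
  renderResult(event.data);                    // renderResult() stands in for your existing UI code
};

worker.postMessage({ chunk: rawFileChunk });   // data is structured-cloned; the UI never blocks
```

```javascript
// parser-worker.js - runs on its own thread.
self.onmessage = event => {
  const result = parseChunk(event.data.chunk); // parseChunk() stands in for the real analysis
  self.postMessage(result);
};
```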

Client notification, should I use an AJAX Push or Poll?

I am working on a simple notification service that will be used to deliver messages to the users surfing a website. The notifications do not have to be sent in real time, but it might be a better user experience if they happened more frequently than, say, every 5 minutes. The data being sent to and from the client is not very large, and it is a straightforward database query to retrieve the data.
In reading other conversations on the topic, it would appear that an AJAX push can result in higher server loads. Since I can tolerate longer server delays, is it worthwhile to have the server push notifications, or should I simply poll?
It is not much harder to implement the push scenario and so I thought I would see what the opinion was here.
Thanks for your help.
EDIT:
I have looked into a simple AJAX Push and implemented a simple demo based on this article by Mike Purvis.
The client load is fairly low at around 5k for the initial version and expected to stay that way for quite some time.
Thank you everyone for your responses. I have decided to go with the polling solution but to wrap it all within a utility library so that if they want to change it later it is easier.
I'm surprised no one here has mentioned long polling. Long polling means keeping an open connection for a longer period (say 30-60 seconds), and once it's closed, re-opening it again, and simply having the socket/connection listen for responses. This results in fewer connections (but longer ones), and means that responses are almost immediate (some may have to wait for a new polling connection). I'd like to add that, in combination with technologies like NodeJS, this results in a very efficient and resource-light solution that is 100% browser compatible across all major browsers and versions, and does not require any additional tech like Comet or Flash.
I realize this is an old question, but thought it might still be useful to provide this information :)
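Sketching the server half of that in Node (plain http module): waiting responses are parked until a notification arrives or a timeout fires. The /poll path, the 30-second timeout and the example trigger are illustrative choices, not a prescription.

```javascript
// Long-polling endpoint in plain Node (sketch). Waiting responses are parked
// until a notification arrives or a ~30s timeout fires; numbers are illustrative.
const http = require('http');

const waiting = [];   // parked client responses

function notifyAll(message) {
  while (waiting.length) {
    const res = waiting.pop();
    res.end(JSON.stringify(message));
  }
}

http.createServer((req, res) => {
  if (req.url === '/poll') {
    res.setHeader('Content-Type', 'application/json');
    waiting.push(res);
    // if nothing arrives within ~30s, answer empty-handed so the client re-polls
    setTimeout(() => {
      const i = waiting.indexOf(res);
      if (i !== -1) {
        waiting.splice(i, 1);
        res.end('null');
      }
    }, 30000);
  } else {
    res.statusCode = 404;
    res.end();
  }
}).listen(8080);

// Example trigger: broadcast a notification to everyone currently waiting.
setInterval(() => notifyAll({ type: 'notification', at: Date.now() }), 45000);
```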
Definitely use push, it's much cooler. If you just want simple notifications, I would use something like StreamHub Push Server to do the heavy lifting for you. Developing your own AJAX push functionality is an extremely tricky and rocky road - you have to get it working in all browsers and then handle firewalls and proxies killing keep-alive connections etc... Why re-invent the wheel? Also, it has a similarly low footprint of less than 10K, so it should suit if that is a priority for you.
Both have different requirements and address different scenarios.
If you need realtime updates, like in an online chat, push is a must.
But, if the refresh period is big, as it is in your case (5 minutes), then polling is the appropriate solution. Push, in this case, would require a lot of resources from both the client and the server.
Tip: try to make the page that checks the pool of pending notifications fast and lean, so it doesn't consume a lot of server resources on each request. What I usually do is keep a flag in memory (like in a session variable) that says whether the pool is empty or not, so I only do the heavy lookup in the pool if it is not empty. When the pool is empty, which is most of the time, the page request runs extremely fast.
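That flag-check idea translates to roughly this in a Node/Express handler (hedged sketch: the original answer keeps the flag in a session variable, here it is just an in-memory boolean per user, and loadPendingMessages() is a placeholder for the heavy lookup).

```javascript
// Poll endpoint with a cheap emptiness flag (Express-style sketch).
// hasPending and loadPendingMessages() are placeholders for your own storage.
const express = require('express');
const app = express();

const hasPending = new Map();   // userId -> boolean, flipped when a message is queued

app.get('/poll/:userId', async (req, res) => {
  const userId = req.params.userId;
  if (!hasPending.get(userId)) {
    return res.json([]);                               // empty pool: answer immediately, no heavy work
  }
  const messages = await loadPendingMessages(userId);  // the expensive lookup, only when needed
  hasPending.set(userId, false);
  res.json(messages);
});

app.listen(3000);
```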
Because using push requires an open HTTP connection to be maintained between your server and each client, I'd go for polling as well - not only is push going to consume a lot of server resources, but it's also going to be significantly more tricky to implement, as matt b mentioned.
My experience with polling is that if you have a frequent enough polling interval on a busy enough site, your web server logs can get flooded with poll requests really quickly.
Edit (2017): I'd say your choices are now between WebSockets and long polling (mentioned in another answer). Long polling sounds like the right choice, given that the question says the notifications don't need to be received in real time; an infrequent polling period would be pretty easy to implement and shouldn't be very taxing on your server. WebSockets are cool and a great choice for many applications these days, but they sound like overkill in this case.
I would implement a poll just because it sounds simpler to write, and keeping it simple is very valuable.
Not sure if you have taken a look at some of the COMET implementations out there (is that what you mean by AJAX push?).
If the user is surfing the site, won't that in effect be requesting information from the server that this notification can piggy-back on?
It's impossible to say whether polling will be more expensive than pushing without knowing how many clients you'll have. I'd recommend polling because:
It sounds like you want to update data about once per minute. Unless notifications are able to arrive at a much faster rate than that, pushing would mean you're keeping an HTTP connection open but seeing very little activity on it.
Polling is built on top of existing HTTP conventions, so any server that talks to web browsers is already ready to respond to ordinary Ajax requests. A Comet– or Flash socket–based solution has different requirements; you'll need something like cometd on the server side and a client-side library that groks server-side push.
So if you needed something heavy-duty to manage a torrent of data and a crapload of clients, I'd recommend Comet. But that doesn't seem to be the case.
There's now a service, http://pusherapp.com, that is trying to solve this problem once and for all, in a blink. Might be worth checking out. (Disclaimer: I am in no way associated with them.)
I haven't tried it myself, but some say COMET works and is easier than you think. There's also a Ruby on Rails plug-in called Juggernaut that I've heard talked about highly. Again, I haven't used it, so YMMV, but my understanding is that it takes far fewer resources compared to polling. I believe (can someone confirm?) that COMET is how MacRumorsLive.com delivers live blogging of WWDC Stevenotes.
