We all know that we should minify our JS and CSS and sprite our images, not only to reduce the payload size being sent but also to reduce the number of HTTP requests. And that's what my question is about: why do browsers limit the number of parallel AJAX requests? (https://stackoverflow.com/a/14768266/769384 - more details on the limitations)
AJAX is asynchronous; we don't know when it finishes. The more we parallelize, the less we have to wait. Limiting the number of requests makes the initial page load more synchronous-like, since we have to wait, blocked, until AJAX slots become free to be used again.
I do have some thoughts on that - but I'd really appreciate a good explanation.
Browsers limit the number of parallel connections to
improve HTTP response times and avoid congestion
This is recommended in the HTTP 1.1 specification (RFC 2616, section 8.1.4):
A single-user client SHOULD NOT maintain more than 2 connections with
any server or proxy.
More practically, while perhaps convenient for load testing, not imposing a hard limit can degrade browser performance, and even cause an unintended DoS attack (or your client incorrectly being classified as a DoS attacker by the server).
Ok, so the thing is, we have multiple NodeJS web servers which need to be online all the time. But they'll not be receiving many requests, approx. 100-200 requests a day. The tasks aren't CPU-intensive either. We are provisioning EC2 instances for it. So, the question is: can we run multiple NodeJS processes on a single core? If not, is it possible to run more low-intensity NodeJS processes than the number of cores present? What are the pros and cons? Are any benchmarks available?
Yes, it is possible. The OS (or VM on top of the OS) will simply share the single CPU among the processes allocated to it. If, as you say, you don't have a lot of requests and those requests aren't very CPU hungry, then everything should work just fine and you probably won't even notice that you're sharing a single CPU among a couple server processes. The OS/VM will time slice among the processes using that CPU, but most of the time you won't even have more than one process asking to use the CPU anyway.
Pros/Cons - Really only that performance might momentarily slow down if both servers get CPU-busy at the same time.
Benchmarks - This is highly dependent upon how much CPU your servers are using and when they try to use it. With the small number of requests you're talking about and the fact that they aren't CPU intensive, it's unlikely a user of either server would even notice. Your CPU is going to be idle most of the time.
If you happen to run a request for each server at the exact same moment and that request would normally take 500ms to complete and most of that was not even CPU time, then perhaps each of these two requests might then take 750ms instead (slightly overlapping CPU time that must be shared). But, most of the time, you're not even going to encounter a request from each of your two servers running at the same time because there are so few requests anyway.
I have to make a million http calls from my nodejs app.
Apart from doing it with the async lib and callbacks, is there any other way to make this many requests in parallel and process them much faster?
Any suggestions would be appreciated.
As the title of your question seems to ask, it's a bit of a folly to actually make millions of parallel requests. Having that many requests in flight at the same time will not help you get the job done any quicker and it will likely exhaust many system resources (memory, sockets, bandwidth, etc...).
Instead, if the goal is to just process millions of requests as fast as possible, then you want to do the following:
Start up enough parallel node.js processes so that you are using all the CPU you have available for processing the request responses. If you have 8 cores in each server involved in the process, then start up 8 node.js processes per server.
Install as much networking bandwidth capability as possible (high throughput connection, multiple network cards, etc...) so you can do the networking as fast as possible.
Use asynchronous I/O processing for all I/O so you are using the system resources as efficiently as possible. Be careful about disk I/O because async disk I/O in node.js actually uses a limited thread pool internal to the node implementation so you can't have an indefinite number of async disk I/O requests actually in flight at the same time. You won't get an error if you try to do this (the excess requests will just be queued), but it won't help you with performance either. Networking in node.js is truly async so it doesn't have this issue.
Open only as many simultaneous requests per node.js process as actually benefit you. How many this is (likely somewhere between 2 and 20) depends upon how much of the total time to process a request is networking vs. CPU and how slow the responses are. If all the requests are going to the same remote server, then saturating it with requests likely won't help you either because you're already asking it to do as much as it can do.
Create a coordination mechanism among your multiple node.js processes to feed each one work and possibly collect results (something like a work queue is often used).
Test like crazy and discover where your bottlenecks are and investigate how to tune or change code to reduce the bottlenecks.
If your requests are all to the same remote server then you will have to figure out how it behaves with multiple requests. A larger server farm will probably not behave much differently if you fire 10 requests at it at once vs. 100 requests at once. But, a single smaller remote server might actually behave worse if you fire 100 requests at it at once. If your requests are all to different hosts, then you don't have this issue at all. If your requests are to a mixture of different hosts and same hosts, then it may pay to spread them around to different hosts so that you aren't making 100 at once of the same host.
The basic ideas behind this are:
You want to maximize your use of the CPU so each CPU is always doing as much as it can.
Since your node.js code is single threaded, you need one node.js process per core in order to maximize your use of the CPU cycles available. Adding additional node.js processes beyond the number of cores will just incur unnecessary OS context switching costs and probably not help performance.
You only need enough parallel requests in flight at the same time to keep the CPU fed with work. Having lots of excess requests in flight beyond what is needed to feed the CPU just increases memory usage beyond what is helpful. If you have enough memory to hold the excess requests, it isn't harmful to have more, but it isn't helpful either. So, ideally you'd set things to have a few more requests in flight at a time than are needed to keep the CPU busy.
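Points 4 and 5 above can be sketched together as a small in-process work queue that keeps a fixed number of requests in flight. This is a minimal sketch under stated assumptions, not a production implementation; `fetchOne` is a stand-in for whatever actually performs the HTTP request, made injectable so the pool logic itself can be exercised without a network:

```javascript
// Process a list of work items with at most `inFlightLimit` requests
// active at any moment. Results come back in the original order.
function processQueue(items, inFlightLimit, fetchOne) {
  return new Promise((resolve, reject) => {
    if (items.length === 0) return resolve([]);
    let next = 0;      // index of the next item to start
    let active = 0;    // requests currently in flight
    let finished = 0;  // requests completed so far
    const results = new Array(items.length);

    function pump() {
      // Start new requests until we hit the in-flight cap or run out.
      while (active < inFlightLimit && next < items.length) {
        const i = next++;
        active++;
        fetchOne(items[i]).then((value) => {
          results[i] = value;
          active--;
          finished++;
          if (finished === items.length) resolve(results);
          else pump(); // a slot freed up; refill it
        }, reject);
      }
    }
    pump();
  });
}
```

With a real fetcher you would pass something like `(url) => fetch(url).then((r) => r.json())` and feed the queue from whatever coordination mechanism distributes work among your processes.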
When calling back to the same server, at what point am I better off making one bigger call, versus multiple parallel requests.
In my particular case, assume that the server processing time (not including request processing, etc) is linear (e.g. 1 big call asking for 3 bits of data takes the same processing time as 3 smaller calls).
I know that if I have a 1000 calls, I am better off batching them so as to not incur all the network overhead. But if I only have 2, I'm assuming parallel requests are probably better.
Is this right?
If so, where is the cutoff?
TL;DR: It depends on a number of factors that are highly dependent on your setup. If performance is a huge concern of yours, I would run tests, either with a 3rd-party application like Wireshark, or by writing some performance-testing code on the server. In general, though, limit the number of parallel requests to under a handful if possible, by concatenating them.
In general, a few requests (in parallel) are okay. A modern browser will attempt to run them in parallel as much as possible, over multiple TCP connections.
That being said, this starts to get bloated, because every single request you make to your server using the HTTP/1.* protocol comes with headers, which can be huge, as they contain things like the referrer and browser cookies. The request body might be one character, but the request itself will be much larger.
Furthermore, the scenario changes with HTTP/2 (or even SPDY), the new transfer protocol. Requests over the wire here are treated differently, and don't always carry the extra weight of all the header metadata that normal requests do. So, if your server and browser support HTTP/2, you might be able to run more requests in parallel.
For the most part, though, you'll be running over HTTP/1.*, which means any more than a couple requests in parallel can see a serious performance impact (in the scenario you described for server processing time) for total completion time over one large load.
There's one other thing to consider, though, which is application-dependent: when does that data matter? If you batch what would have been a ton of small requests into one larger one, none of the return data comes back until the entire operation is complete server-side. If you need to display data more rapidly, or you want to load things step by step for slower network conditions, the performance trade-off might be worth it for multiple small requests.
Hope this explanation helps.
Definitely read up on the FAQ for HTTP/2: they also cover some of the same performance issues you'll run into with HTTP/1.* in the scenario you described
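To make the trade-off concrete, here is a sketch of the two shapes of client code. The endpoints (`/api/item`, `/api/items`) are hypothetical names, not a real API; the point is that the parallel version pays per-request header overhead N times and gets results as each response lands, while the batched version pays the overhead once but returns nothing until the whole server-side operation completes:

```javascript
// N parallel small requests: N sets of headers, but each result
// becomes available as soon as its individual response arrives.
function loadParallel(ids) {
  return Promise.all(
    ids.map((id) => fetch(`/api/item?id=${id}`).then((r) => r.json()))
  );
}

// One batched request: one set of headers, one round trip, but no
// data at all until the entire operation is done server-side.
function loadBatched(ids) {
  return fetch(`/api/items?ids=${ids.join(',')}`).then((r) => r.json());
}
```

If partial results are useful to your UI, the parallel shape lets you render them incrementally; the batched shape trades that away for less protocol overhead.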
For years, web developers have followed the logic that minimizing HTTP connections speeds up applications because the browser isn't choking on the download/execution of code. For example Yahoo has long touted their best practices, and tell us to combine CSS/JavaScript/image resources into single files - thereby reducing the total number of HTTP requests and compressing the total resource size.
But other "best practices" exist with regards to increasing webpage speed - specifically, maximizing the number of parallel HTTP downloads (from Google). This approach tells us that by spreading the HTTP connections across multiple hostnames the browser can do more simultaneously.
So as modern web applications are becoming very large (e.g. 3MB+ of JavaScript alone) the question must be asked:
Will my application load faster with 3MB+ of JavaScript in a single file? Or will it load faster with multiple, smaller files spread across hostnames?
For the sake of simplicity we should also assume other "best practices" are being followed, so this question best exists in a vacuum.
I have yet to see any empirical data on the subject, but I imagine there has to be a point where the performance of these approaches diverge - so knowing where that sweet-spot exists would be ideal.
I think this depends on the number of sockets available to the browser. Say the browser has 4 sockets available; 4 smaller requests will be faster than 1 large request.
The trick here would be knowing at startup what requests your application will send and matching the number of requests to the number of sockets the browser can use. I believe browsers only have 4, but to be honest I haven't ever looked to see whether that number has changed in modern browsers.
Looks like each browser can have its own number of sockets, some having 2: Max parallel http connections in a browser?
https://stackoverflow.com/a/985704/925782 says IE10 is winner with 8 sockets, wow, go IE :)
Cache control would also play a part in this, of course, where the first load would fetch everything and subsequent loads would make fewer actual requests.
If you want to get geeky: http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/
I agree that some charts and real data would be a great blog post, my response is purely theoretical in nature.
I would pick parallel downloads.
Smaller JS files can be parsed faster than one monster-sized package, and in most cases you do not need all of the JS at once either.
Concatenating assets is considered the better practice currently because HTTP requests are expensive. One of HTTP/2.0's goals is to make requests cheap by multiplexing them within the same TCP connection. Server push in HTTP/2.0 can leverage this even more by sending some essential assets to the client ahead of time.
Chrome, Firefox, Opera and IE11 already support HTTP/2.0, and support is available for popular web servers (Apache, nginx).
Is there a way to keep a HTTP connection alive with JavaScript?
In short, I think the concept of long-lived HTTP connections in JavaScript really revolves around a style of communication called COMET. This can be achieved in several different ways, but essentially involves the client (using XmlHttp powers) requesting data from the server immediately, and the server withholding the response until some event triggers it. Upon receipt of this response, the client immediately makes another request (which will once again hang at the server end until something needs sending). This simulates server push, but is effectively nothing more than a delayed response used in a clever way. In the worst case there can be fairly high latency (i.e. if 2 messages need sending, the cycle must be repeated twice, with all the costs involved), but generally, if the messaging rate is low, this gives a reasonable impression of real-time push.
Implementing the server side for this kind of communication is far from trivial, and requires a good deal of work on asynchronous communication, concurrency issues and the like. It's quite easy to write an implementation that can support a few hundred users each on their own thread, but scaling to thousands requires a much more considered approach.
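The client side of the loop described above can be sketched in a few lines. The `/events` URL and `handleMessage` callback are illustrative names, not a real API; the key point is that a new request is parked at the server the moment the previous response arrives:

```javascript
// COMET-style long polling: each response immediately triggers the
// next request, so the server always has a pending request it can
// answer the instant an event occurs.
function longPoll(url, handleMessage) {
  let stopped = false;
  function poll() {
    if (stopped) return;
    fetch(url)
      .then((r) => r.json())
      .then((msg) => {
        handleMessage(msg);
        poll(); // immediately park another request at the server
      })
      .catch(() => {
        if (!stopped) setTimeout(poll, 1000); // back off briefly on errors
      });
  }
  poll();
  return () => { stopped = true; }; // call the returned function to stop
}
```

In a real deployment the server would hold each request open until it has something to send, which is exactly the "withheld response" described above.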
I note that the last answer was given in 2009. Oh, how I remember the days. But lots of good things have happened since then; so I'll add this just to let people know what to look for. HTTP 1.0 provided a "keep-alive" request header that meant the connection should be maintained for further requests. In HTTP 1.1, this became the default. You actually have to opt out if you don't want to reuse the connection (and if you want to be nice about it).
The new "WebSockets" standard actually gives you a full-duplex persistent connection. WebSockets are supported in all up-to-date versions of popular browsers, and you can even use them in MSIE if you install Google Chrome Frame (which means Google software is actually doing the work). Microsoft says IE supports it natively in version 10, but I haven't tried it myself. What you need then is something to connect to, like http://highlevellogic.blogspot.se/2011/09/websocket-server-demonstration_26.html