Minimizing HTTP Connections vs. Parallel Downloads

Minimizing HTTP Connections vs. Parallel Downloads - javascript

For years, web developers have followed the logic that minimizing HTTP connections speeds up applications because the browser isn't choking on the download/execution of code. For example Yahoo has long touted their best practices, and tell us to combine CSS/JavaScript/image resources into single files - thereby reducing the total number of HTTP requests and compressing the total resource size.
But other "best practices" exist with regards to increasing webpage speed - specifically, maximizing the number of parallel HTTP downloads (from Google). This approach tells us that by spreading the HTTP connections across multiple hostnames the browser can do more simultaneously.
So as modern web applications are becoming very large (e.g. 3MB+ of JavaScript alone) the question must be asked:
Will my application load faster with 3MB+ of JavaScript in a single file? Or will it load faster with multiple, smaller files spread across hostnames?
For the sake of simplicity we should also assume other "best practices" are being followed, so this question best exists in a vacuum.
I have yet to see any empirical data on the subject, but I imagine there has to be a point where the performance of these approaches diverge - so knowing where that sweet-spot exists would be ideal.

I think this depends on number of sockets available for the browser. Say the browser has it's 4 sockets available, 4 smaller requests will be faster than 1 large request.
The trick here would be knowing at startup what requests your application will send and maximize the # of requests for # of sockets a browser can use. I believe browsers only have 4 but to be honest I haven't ever looked to see if that number has changed in modern browsers.
Looks like each browser can have it's own number of sockets, some having 2: Max parallel http connections in a browser?
https://stackoverflow.com/a/985704/925782 says IE10 is winner with 8 sockets, wow, go IE :)
Cache control would also play part in this of course where first load would be everything, subsequent loads would be less actual requests.
If you want to get geeky: http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/
I agree that some charts and real data would be a great blog post, my response is purely theoretical in nature.

I would pick parallel downloads.
Smaller JS files can be parsed faster than one monster-sized package. In most of the cases you do not need all of the JS at once either.
Concatenating assets is considered better practice currently because of expensive http requests. One of HTTP/2.0 goals is to make it cheap by multiplexing requests within same tcp connection. Server push in HTTP/2.0 can leverage it even more by sending some essential assets to the client ahead of time.
Chrome, FF, Opera and IE11 already support HTTP/2.0, and its support is available for popular webservers (apache, nginx)

Related

is combining all js files into a single file necessarily faster?

I always hear in production, you want to combine multiple .js files into 1 to make it load faster.
But since browser actually makes multiple request concurrently, there's a chance that multiple files can be loaded faster than a single file, which has to be downloaded from beginning to end.
Is this reasoning correct?

It's a complex area.
The browser making multiple concurrent connections to the same server (which are usually quite limited in number) doesn't make the connection between the client and server faster. The pipes between them are only so big, and the server only has so much delivery capacity. So there's little if any reason to believe 4 parallel downloads, each of 10k, from the same server are likely to be faster than 1 download of 40k from that server. Add to that the fact that browsers limit the number of concurrent connections to the same server, and the expense of setting up those individual connections (which is non-trivial), and you're still better off with one large file for your own scripts.
For now. This is an area being actively developed by Google and others.
If you can load scripts from multiple servers (for instance, perhaps load common libraries from any of the several CDNs that make them accessible, and your own single combined script from your own server [or CDN]), it can make sense to separate those. It doesn't make the client's connection faster, but if the client's connection isn't the limiting factor, you can get a benefit. And of course, for a site that doesn't justify having its own CDN, loading common libraries from the free CDNs and just your own scripts from your own server lets you get the advantage of edge-casting and such on the scripts you load from the free CDNs.

For Large JS files:
Not Good idea,If you have small JS files then its good idea to merage otherwise
suppose if JS files is more than 500kbs then single file will make in
MBS and take huge loading HTTP request time.
For small JS files:
Good idea ,for small it has good idea but its better to use only 3rd party tool
which will also compress your final single file so that HTTP request
time will take less time. I would suggest using PHP Minify(but you can find other which suit you), which lets
you create a single HTTP request for a group of JS or CSS files.
Minify also handles GZipping, Compression, and HTTP Headers for client
side caching.
demo status of PHP minify
Before
After

It depends on if your server is HTTP/2 or HTTP/1.1.
HTTP/2
HTTP/2 (H2) allows a server to quickly respond to multiple requests, allowing the client to streamline all the requests without waiting for the first one to return and parse. This helps to mitigate the need for concatenation, but doesn't entirely remove it. See this post for an in-depth answer to when you should or shouldn't concatenate.
Another thing to keep in mind is that if your server gzips your assets, it can actually be better to concatenate some of them together since gzipping can perform better on larger files with lots of repeating text. By separating all your files out, you could actually hurt your overall performance. Finding the most optimal solution will require some trial and error (a lot of this is still new and so best practices are still being discovered).
HTTP/1.1
With HTTP/1.1, as the other answers have pointed out, for the majority of cases combining all your files into one is better. This reduces the number of HTTP requests, which can be slow with HTTP/1.1. There are ways to mitigate this by requesting assets from different subdomains to allow multiple concurrent requests.
I recommend reading High Performance Browser Networking for a complete understanding on strategies for HTTP/1.1.

How to make millions of parallel http requests from nodejs app?

I have to make a million http calls from my nodejs app.
Apart from doing it using async lib, callbacks is there any other way to call these many requests in parallel to process it much faster?
Kindly suggest me on the same

As the title of your question seems to ask, it's a bit of a folly to actually make millions of parallel requests. Having that many requests in flight at the same time will not help you get the job done any quicker and it will likely exhaust many system resources (memory, sockets, bandwidth, etc...).
Instead, if the goal is to just process millions of requests as fast as possible, then you want to do the following:
Start up enough parallel node.js processes so that you are using all the CPU you have available for processing the request responses. If you have 8 cores in each server involved in the process, then start up 8 node.js processes per server.
Install as much networking bandwidth capability as possible (high throughput connection, multiple network cards, etc...) so you can do the networking as fast as possible.
Use asynchronous I/O processing for all I/O so you are using the system resources as efficiently as possible. Be careful about disk I/O because async disk I/O in node.js actually uses a limited thread pool internal to the node implementation so you can't have an indefinite number of async disk I/O requests actually in flight at the same time. You won't get an error if you try to do this (the excess requests will just be queued), but it won't help you with performance either. Networking in node.js is truly async so it doesn't have this issue.
Open only as many simultaneous requests per node.js process as actually benefit you. How many this is (likely somewhere between 2 and 20) depends upon how much of the total time to process a request is networking vs. CPU and how slow the responses are. If all the requests are going to the same remote server, then saturating it with requests likely won't help you either because you're already asking it to do as much as it can do.
Create a coordination mechanism among your multiple node.js processes to feed each one work and possibly collect results (something like a work queue is often used).
Test like crazy and discover where your bottlenecks are and investigate how to tune or change code to reduce the bottlenecks.
If your requests are all to the same remote server then you will have to figure out how it behaves with multiple requests. A larger server farm will probably not behave much differently if you fire 10 requests at it at once vs. 100 requests at once. But, a single smaller remote server might actually behave worse if you fire 100 requests at it at once. If your requests are all to different hosts, then you don't have this issue at all. If your requests are to a mixture of different hosts and same hosts, then it may pay to spread them around to different hosts so that you aren't making 100 at once of the same host.
The basic ideas behind this are:
You want to maximize your use of the CPU so each CPU is always doing as much as it can.
Since your node.js code is single threaded, you need one node.js process per core in order to maximize your use of the CPU cycles available. Adding additional node.js processes beyond the number of cores will just incur unnecessary OS context switching costs and probably not help performance.
You only need enough parallel requests in flight at the same time to keep the CPU fed with work. Having lots of excess requests in flight beyond what is needed to feed the CPU just increases memory usage beyond what is helpful. If you have enough memory to hold the excess requests, it isn't harmful to have more, but it isn't helpful either. So, ideally you'd set things to have a few more requests in flight at a time than are needed to keep the CPU busy.

Javascript multiple calls or 1 big one

When calling back to the same server, at what point am I better off making one bigger call, versus multiple parallel requests.
In my particular case, assume that the server processing time (not including request processing, etc) is linear (e.g. 1 big call asking for 3 bits of data takes the same processing time as 3 smaller calls).
I know that if I have a 1000 calls, I am better off batching them so as to not incur all the network overhead. But if I only have 2, I'm assuming parallel requests are probably better.
Is this right?
If so, where is the cutoff?

TL;DR: It depends on a number of factors that are highly dependant on your setup. If performance is a huge concern of yours, I would run tests, either with a 3rd party application like Wireshark, or write some performance testing code on the server. In general though, limit your amount of parallel requests to under a handful if possible, by concatenating them.
In general, a few requests (in parallel) are okay. A modern browser will attempt to run them in parallel as much as possible over the TCP stream.
That being said, this starts to get bloated because every single request you make at your server using the HTTP/1.* protocol comes with headers, which can be huge, as they contain things like the referrer and browser cookies. The request body might be one character, but the request itself will be much larger.
Furthermore, the scenario changes with HTTP/2 (or even SPDY), the new transfer protocol. Requests over the wire here are treated differently, and don't always carry the extra weight of all the header metadata that normal requests do. So, if your server and browser support HTTP/2, you might be able to run more requests in parallel.
For the most part, though, you'll be running over HTTP/1.*, which means any more than a couple requests in parallel can see a serious performance impact (in the scenario you described for server processing time) for total completion time over one large load.
There's one other thing to consider though, too, which is application dependant: when does that data matter? If you batch what would have been a ton of small requests into one larger one, none of the return data will come back until the entire operation is complete server-side. If you need to display data more rapidly, or you want to load things step-by-step for slower network conditions, the performance trade-off might be worth it for multiple small requests.
Hope this explanation helps.
Definitely read up on the FAQ for HTTP/2: they also cover some of the same performance issues you'll run into with HTTP/1.* in the scenario you described

Best way to determine if new data available

There're many highload sites notify their users about new messages/topics runtime, without page reloading. How do they do that? Which approach do they use?
I assume there's two approaches:
"Asking" the server using JavaScript each time gap
Use websockets
By common opinion, the first one is too heavy for the server, since it produces too many requests.
About second one's behaviour in highload apps I know nothing, is it fine one?
So, which design approach to use to implement functions like "new msg available" properly without the need to reload the page?
The question rather about performance :)

WebSocket performance in the browser is not an issue, and on the server side there are performant implementations. As an example, Crossbar.io can easily handle 180k concurrent connections on a small server (tested in a VM on an older i5 notebook), and 10k/s messages - and both scale with the hardware (RAM and CPU respectively). Also: Something like Crossbar.io/Autobahn/WAMP gives you a protocol on top of WebSockets to handle the distribution of notifications to clients, making your life easier.
Full disclosure: I work for the company that works on Crossbar.io, and there are other WebSocket and PubSub solutions out there. Take a look at what best fits you use case and go with that.

Logic behind concatenating CSS/JS assets

From the Ruby on Rails documentation:
The first feature of the pipeline is to concatenate assets. This is
important in a production environment, because it can reduce the
number of requests that a browser must make to render a web page. Web
browsers are limited in the number of requests that they can make in
parallel, so fewer requests can mean faster loading for your
application.
This is widely considered a best practice around the web. But doesn't conventional logic tell us that loading even three files in parallel is faster than loading a concatenated version serially. So even if there is an upper limit on the number of parallel connections, it should be faster than waiting for one huge file on a single connection. Or does it have to do with the overhead for each request?

The HTTP specifications suggest 4 concurrent connections at the same time. So every browser will be by default set around this number. So, when your page has more than 4 files (including images) it makes sense to concatenate.
For most browsers it is possible to change the number of parallel connections, but that works than only on your machine and not for the user.

We Keep Coding

JavaScript is the programming language of the Web.