When calling back to the same server, at what point am I better off making one bigger call versus multiple parallel requests?
In my particular case, assume that the server processing time (not including request processing, etc.) is linear (e.g. 1 big call asking for 3 bits of data takes the same processing time as 3 smaller calls).
I know that if I have 1000 calls, I am better off batching them so as not to incur all the network overhead. But if I only have 2, I'm assuming parallel requests are probably better.
Is this right?
If so, where is the cutoff?
TL;DR: It depends on a number of factors that are highly dependent on your setup. If performance is a big concern of yours, run tests, either with a third-party application like Wireshark or by writing some performance-testing code on the server. In general, though, limit the number of parallel requests to under a handful if possible, by concatenating them.
In general, a few requests in parallel are okay. A modern browser will try to run them in parallel as much as possible over its available TCP connections.
That being said, this starts to get bloated, because every single request you make to your server using the HTTP/1.* protocol comes with headers, which can be huge, since they contain things like the referrer and browser cookies. The request body might be one character, but the request itself will be much larger.
Furthermore, the scenario changes with HTTP/2 (or even SPDY), the newer transfer protocol. Requests over the wire are treated differently there, and headers are compressed, so requests don't carry the full weight of the header metadata that HTTP/1.* requests do. So, if your server and browser support HTTP/2, you may be able to run more requests in parallel.
For the most part, though, you'll be running over HTTP/1.*, which means that anything more than a couple of requests in parallel can seriously hurt total completion time compared with one large load (given the server processing time you described).
There's one other thing to consider, though, and it's application dependent: when does that data matter? If you batch what would have been a ton of small requests into one larger one, none of the response data comes back until the entire operation is complete server-side. If you need to display data more quickly, or you want to load things step by step for slower network conditions, the trade-off might tip back toward multiple small requests.
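To make the trade-off concrete, here is a minimal browser-side sketch; the /api/item and /api/items endpoints are hypothetical stand-ins for your own API.

    // Option A: a few parallel requests; each one carries its own full set
    // of HTTP/1.* headers (cookies, referrer, ...).
    async function loadSeparately(ids) {
      const responses = await Promise.all(
        ids.map(id => fetch(`/api/item/${id}`))
      );
      return Promise.all(responses.map(r => r.json()));
    }

    // Option B: one batched request; a single set of headers, but nothing
    // comes back until the whole batch is ready server-side.
    async function loadBatched(ids) {
      const response = await fetch(`/api/items?ids=${ids.join(',')}`);
      return response.json();
    }

For two or three items, Option A often finishes sooner; as the count grows, the per-request header overhead tends to make Option B win.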
Hope this explanation helps.
Definitely read up on the FAQ for HTTP/2: it also covers some of the same performance issues you'll run into with HTTP/1.* in the scenario you described.
Ok, so the thing is, we have multiple NodeJS web servers which need to be online all the time, but they won't be receiving many requests, approx. 100-200 requests a day. The tasks aren't CPU intensive either. We are provisioning EC2 instances for them. So, the question is: can we run multiple NodeJS processes on a single core? If not, is it possible to run more low-intensity NodeJS processes than the number of cores present? What are the pros and cons? Are any benchmarks available?
Yes, it is possible. The OS (or VM on top of the OS) will simply share the single CPU among the processes allocated to it. If, as you say, you don't have a lot of requests and those requests aren't very CPU hungry, then everything should work just fine and you probably won't even notice that you're sharing a single CPU among a couple server processes. The OS/VM will time slice among the processes using that CPU, but most of the time you won't even have more than one process asking to use the CPU anyway.
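As a minimal sketch (the file name and ports are arbitrary), two such servers can simply be started as separate processes and the OS will time-slice the core between them:

    // server.js -- a tiny HTTP server; start one process per service:
    //   node server.js 3000
    //   node server.js 3001
    const http = require('http');

    const port = Number(process.argv[2]) || 3000;

    http.createServer((req, res) => {
      // Not CPU intensive: this process is idle almost all the time,
      // so several like it can comfortably share one core.
      res.end(`hello from the server on port ${port}\n`);
    }).listen(port, () => console.log(`listening on ${port}`));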
Pros/Cons - Really only that performance might momentarily slow down if both servers get CPU-busy at the same time.
Benchmarks - This is highly dependent upon how much CPU your servers are using and when they try to use it. With the small number of requests you're talking about and the fact that they aren't CPU intensive, it's unlikely a user of either server would even notice. Your CPU is going to be idle most of the time.
If you happen to run a request for each server at the exact same moment and that request would normally take 500ms to complete and most of that was not even CPU time, then perhaps each of these two requests might then take 750ms instead (slightly overlapping CPU time that must be shared). But, most of the time, you're not even going to encounter a request from each of your two servers running at the same time because there are so few requests anyway.
So I was looking at this module, and I cannot understand why it uses callbacks.
I thought that memory caching is supposed to be fast and that is also the purpose someone would use caching, because it's fast... like instant.
Adding callbacks implies that you need to wait for something.
But how long do you actually need to wait? If the result gets back to you very fast, aren't you slowing things down by wrapping everything in callbacks, plus promises on top (because as a user of this module you are forced to promisify those callbacks)?
By design, JavaScript is asynchronous for most of its external calls (HTTP, third-party libraries, ...).
As mentioned here:
JavaScript is a single-threaded language. This means it has one call stack and one memory heap. As expected, it executes code in order and must finish executing a piece of code before moving on to the next. It's synchronous, but at times that can be harmful. For example, if a function takes a while to execute or has to wait on something, it freezes everything up in the meanwhile.
A synchronous function will block the thread and the execution of the script. To avoid any blocking (due to networking, file access, etc.), it is recommended to fetch this information asynchronously.
Most of the time, the Redis call will take a few milliseconds. Handling it asynchronously, however, guards against possible network lag and keeps your application up and running for your customers with a minimum of connectivity errors.
TL;DR: reading from memory is fast; reading from the network is slower and shouldn't block the process.
You're right: reading from a memory cache is very fast. It's as fast as accessing any variable (micro- or nanoseconds), so there is no good reason to implement it as a callback/promise, since that would make it significantly slower. But this is only true if you're reading from the Node.js process's own memory.
The problem with using Redis from Node is that the memory cache lives on another machine (the Redis server), or at least in another process. So even if Redis reads the data very quickly, it still has to go over the network to get back to your Node server, and that isn't always guaranteed to be fast (usually at least a few milliseconds). For example, if your Redis server is not physically close to your Node.js server, or there are too many network requests, the request can take longer to reach Redis and return to your server. Now imagine this were blocking by default: it would prevent your server from doing anything else until the request completed, which would result in very poor performance, with your server sitting idle waiting for the network round trip. That's the reason why any I/O (disk, network, ...) operation in Node.js should be async by default.
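For illustration only (this sketch assumes the promise-based node-redis v4 client; other clients differ), compare an in-process lookup with a cache read that has to cross the network:

    const { createClient } = require('redis'); // node-redis v4, promise-based API

    const localCache = new Map();  // lives inside this Node.js process
    const redis = createClient();  // talks to a separate process/machine

    async function main() {
      await redis.connect();

      // In-process memory: effectively instant, no callback needed.
      localCache.set('greeting', 'hello');
      const fromLocal = localCache.get('greeting');

      // Redis: the lookup itself is fast, but the round trip over the
      // network is I/O, so the API is asynchronous and the event loop
      // stays free while we wait.
      await redis.set('greeting', 'hello');
      const fromRedis = await redis.get('greeting');

      console.log(fromLocal, fromRedis);
      await redis.quit();
    }

    main().catch(console.error);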
Alex, you remarked: "I thought that memory caching is supposed to be fast and that is also the purpose someone would use caching, because it's fast... like instant." And you're very nearly right.
Now, what does Redis actually mean?
It means REmote DIctionary Server.
~ FAQ - Redis
Yes, a dictionary lookup usually runs in O(1) time. However, note that this performance is only observed by code running inside the process that holds the dictionary. Accessing memory owned by the Redis process from another process involves a chain of operations that is not O(1).
So, because Redis is a REmote DIctionary Server, asynchronous APIs are required to access its service.
As has already been answered here, your Redis instance could be on your own machine (and accessing Redis's RAM storage is nearly as fast as accessing a regular JavaScript variable), but it could also be on another machine/cloud/cluster/you name it. In that case, network latency can be a problem, which is why the API uses promises/callbacks.
If you are 100% confident that your Redis instance will always live on the same machine as your code, and that waiting on those calls is acceptable, you can just use the async/await syntax (ES2017) to write them as if they were synchronous and avoid explicit callbacks or promise chains :)
I'm not sure it's worth it, though, in terms of coding habits and scalability. But every project is different, and it might suit yours.
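A minimal sketch of that style, assuming an already-connected, promise-based Redis client (such as node-redis v4) and a hypothetical loadUserFromDb() fallback:

    // Reads like synchronous code, but is still non-blocking under the hood.
    async function getUser(redis, id) {
      const cached = await redis.get(`user:${id}`);  // awaits the network round trip
      if (cached) return JSON.parse(cached);

      const user = await loadUserFromDb(id);         // hypothetical DB lookup
      await redis.set(`user:${id}`, JSON.stringify(user));
      return user;
    }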
We all know that we should minify our JS and CSS and sprite our images, not only to reduce the size being sent, but also to reduce the number of HTTP requests being sent. And that's what my question is about: why do browsers limit parallel AJAX requests (https://stackoverflow.com/a/14768266/769384 has more details on the limitations)?
AJAX is asynchronous; we don't know when it finishes. The more we parallelize, the less we'd have to wait. Limiting the number of requests makes the initial page loading more synchronous-like, since we have to wait, blocked, until the AJAX slots become free to be used again.
I do have some thoughts on that - but I'd really appreciate a good explanation.
Browsers limit the number of parallel connections to improve HTTP response times and avoid congestion.
This is recommended in the HTTP 1.1 Specification:
A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.
More practically, while perhaps convenient for load testing, not imposing a hard limit can degrade browser performance and even cause an unintended DoS attack (or get your client incorrectly classified as a DoS attacker by the server).
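If you do need to fire off a large number of AJAX calls, one common pattern is to impose your own ceiling in code rather than queueing blindly behind the browser's limit. A rough sketch (the default limit of 6 simply mirrors a typical per-host browser default):

    // Fetch many URLs while keeping at most `limit` requests in flight.
    async function fetchWithLimit(urls, limit = 6) {
      const results = new Array(urls.length);
      let next = 0;

      async function worker() {
        while (next < urls.length) {
          const i = next++;               // claim the next URL (single-threaded, no race)
          const res = await fetch(urls[i]);
          results[i] = await res.json();
        }
      }

      // Start `limit` workers; each picks up a new URL as soon as it finishes one.
      await Promise.all(Array.from({ length: limit }, worker));
      return results;
    }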
I have to make a million HTTP calls from my Node.js app.
Apart from doing it with the async lib and callbacks, is there any other way to make this many requests in parallel so they are processed faster?
Kindly suggest.
As the title of your question seems to ask, it's a bit of a folly to actually make a million parallel requests. Having that many requests in flight at the same time will not help you get the job done any quicker, and it will likely exhaust many system resources (memory, sockets, bandwidth, etc.).
Instead, if the goal is to just process millions of requests as fast as possible, then you want to do the following:
Start up enough parallel node.js processes so that you are using all the CPU you have available for processing the request responses. If you have 8 cores in each server involved in the process, then start up 8 node.js processes per server.
Install as much networking bandwidth capability as possible (high throughput connection, multiple network cards, etc...) so you can do the networking as fast as possible.
Use asynchronous I/O processing for all I/O so you are using the system resources as efficiently as possible. Be careful about disk I/O because async disk I/O in node.js actually uses a limited thread pool internal to the node implementation so you can't have an indefinite number of async disk I/O requests actually in flight at the same time. You won't get an error if you try to do this (the excess requests will just be queued), but it won't help you with performance either. Networking in node.js is truly async so it doesn't have this issue.
Open only as many simultaneous requests per node.js process as actually benefit you. How many this is (likely somewhere between 2 and 20) depends upon how much of the total time to process a request is networking vs. CPU and how slow the responses are. If all the requests are going to the same remote server, then saturating it with requests likely won't help you either because you're already asking it to do as much as it can do.
Create a coordination mechanism among your multiple node.js processes to feed each one work and possibly collect results (something like a work queue is often used).
Test like crazy and discover where your bottlenecks are and investigate how to tune or change code to reduce the bottlenecks.
If your requests all go to the same remote server, then you will have to figure out how it behaves under multiple requests. A larger server farm probably won't behave much differently whether you fire 10 requests at it at once or 100. But a single smaller remote server might actually behave worse if you fire 100 requests at it at once. If your requests all go to different hosts, then you don't have this issue at all. If your requests go to a mixture of different hosts and the same hosts, then it may pay to spread them around so that you aren't making 100 at once to the same host.
The basic ideas behind this are:
You want to maximize your use of the CPU so each CPU is always doing as much as it can.
Since your node.js code is single threaded, you need one node.js process per core in order to maximize your use of the CPU cycles available. Adding additional node.js processes beyond the number of cores will just incur unnecessary OS context switching costs and probably not help performance.
You only need enough parallel requests in flight at the same time to keep the CPU fed with work. Having lots of excess requests in flight beyond what is needed to feed the CPU just increases memory usage beyond what is helpful. If you have enough memory to hold the excess requests, it isn't harmful to have more, but it isn't helpful either. So, ideally you'd set things to have a few more requests in flight at a time than are needed to keep the CPU busy.
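Putting the points above together, here is a rough sketch (Node 18+, so the global fetch is available; getUrlChunk() is a hypothetical stand-in for whatever work queue feeds each worker):

    // Sketch only: one worker process per core, each keeping a bounded
    // number of requests in flight. Tune CONCURRENCY_PER_WORKER per the
    // 2-20 guideline above.
    const cluster = require('cluster');
    const os = require('os');

    const CONCURRENCY_PER_WORKER = 10;

    async function processUrls(urls) {
      let next = 0;
      async function pump() {
        while (next < urls.length) {
          const url = urls[next++];
          try {
            const res = await fetch(url);   // network I/O is truly async in Node
            await res.text();               // ...process the response here
          } catch (err) {
            console.error('request failed:', url, err.message);
          }
        }
      }
      // Keep CONCURRENCY_PER_WORKER requests in flight at all times.
      await Promise.all(Array.from({ length: CONCURRENCY_PER_WORKER }, pump));
    }

    if (cluster.isPrimary) {
      // One worker per core; a real job queue would coordinate the work here.
      for (let i = 0; i < os.cpus().length; i++) cluster.fork({ CHUNK: i });
    } else {
      const urls = getUrlChunk(Number(process.env.CHUNK)); // hypothetical: this worker's share
      processUrls(urls).then(() => process.exit(0));
    }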
From the Ruby on Rails documentation:
The first feature of the pipeline is to concatenate assets. This is important in a production environment, because it can reduce the number of requests that a browser must make to render a web page. Web browsers are limited in the number of requests that they can make in parallel, so fewer requests can mean faster loading for your application.
This is widely considered a best practice around the web. But doesn't conventional logic tell us that loading even three files in parallel is faster than loading a concatenated version serially? So even if there is an upper limit on the number of parallel connections, it should be faster than waiting for one huge file on a single connection. Or does it have to do with the overhead of each request?
The HTTP/1.1 specification recommended that a client keep no more than 2 concurrent connections to any one server, and modern browsers typically default to around 6 per host. So, when your page needs more files (including images) than the browser will fetch in parallel, it makes sense to concatenate.
For most browsers it is possible to change the number of parallel connections, but that only works on your own machine, not for your users.