I have to make a million HTTP calls from my Node.js app.
Apart from using the async library and callbacks, is there any other way to fire that many requests in parallel so they are processed faster?
Any suggestions would be appreciated.
As the title of your question seems to ask, it's a bit of a folly to actually make millions of parallel requests. Having that many requests in flight at the same time will not help you get the job done any quicker and it will likely exhaust many system resources (memory, sockets, bandwidth, etc...).
Instead, if the goal is to just process millions of requests as fast as possible, then you want to do the following:
Start up enough parallel node.js processes so that you are using all the CPU you have available for processing the request responses. If you have 8 cores in each server involved in the process, then start up 8 node.js processes per server.
Install as much networking bandwidth capability as possible (high throughput connection, multiple network cards, etc...) so you can do the networking as fast as possible.
Use asynchronous I/O for all I/O so you are using system resources as efficiently as possible. Be careful with disk I/O: async disk I/O in node.js actually runs on a limited thread pool internal to the node implementation, so you can't have an unbounded number of async disk I/O requests truly in flight at the same time. You won't get an error if you try (the excess requests are just queued), but it won't help performance either. Networking in node.js is truly async, so it doesn't have this issue.
Open only as many simultaneous requests per node.js process as actually benefit you. How many this is (likely somewhere between 2 and 20) depends upon how much of the total time to process a request is networking vs. CPU and how slow the responses are. If all the requests are going to the same remote server, then saturating it with requests likely won't help you either because you're already asking it to do as much as it can do.
Create a coordination mechanism among your multiple node.js processes to feed each one work and possibly collect results (something like a work queue is often used).
Test like crazy and discover where your bottlenecks are and investigate how to tune or change code to reduce the bottlenecks.
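As a sketch of the limited-concurrency point above (a few requests in flight per process, not millions), here's a minimal pool; `worker` is a placeholder for whatever async request function you actually use:

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// `worker` stands in for your real async HTTP request function.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function lane() {
    while (next < items.length) {
      const i = next++; // claim the next item; safe because JS is single threaded
      results[i] = await worker(items[i]);
    }
  }
  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}
```

You'd tune `limit` by measuring throughput, per the advice above it's likely somewhere between 2 and 20.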
If your requests are all to the same remote server then you will have to figure out how it behaves with multiple requests. A larger server farm will probably not behave much differently if you fire 10 requests at it at once vs. 100 requests at once. But, a single smaller remote server might actually behave worse if you fire 100 requests at it at once. If your requests are all to different hosts, then you don't have this issue at all. If your requests are to a mixture of different hosts and same hosts, then it may pay to spread them around to different hosts so that you aren't making 100 at once to the same host.
The basic ideas behind this are:
You want to maximize your use of the CPU so each CPU is always doing as much as it can.
Since your node.js code is single threaded, you need one node.js process per core in order to maximize your use of the CPU cycles available. Adding additional node.js processes beyond the number of cores will just incur unnecessary OS context switching costs and probably not help performance.
You only need enough parallel requests in flight at the same time to keep the CPU fed with work. Having lots of excess requests in flight beyond what is needed to feed the CPU just increases memory usage beyond what is helpful. If you have enough memory to hold the excess requests, it isn't harmful to have more, but it isn't helpful either. So, ideally you'd set things to have a few more requests in flight at a time than are needed to keep the CPU busy.
Related
Ok, so the thing is, we have multiple Node.js web servers which need to be online all the time, but they won't be receiving many requests, approximately 100-200 requests a day. The tasks aren't CPU intensive either. We are provisioning EC2 instances for them. So, the question is: can we run multiple Node.js processes on a single core? If not, is it possible to run more low-intensity Node.js processes than the number of cores present? What are the pros and cons? Are any benchmarks available?
Yes, it is possible. The OS (or VM on top of the OS) will simply share the single CPU among the processes allocated to it. If, as you say, you don't have a lot of requests and those requests aren't very CPU hungry, then everything should work just fine and you probably won't even notice that you're sharing a single CPU among a couple of server processes. The OS/VM will time slice among the processes using that CPU, but most of the time you won't even have more than one process asking to use the CPU anyway.
Pros/Cons - Really only that performance might momentarily slow down if both servers get CPU-busy at the same time.
Benchmarks - This is highly dependent upon how much CPU your servers are using and when they try to use it. With the small number of requests you're talking about and the fact that they aren't CPU intensive, it's unlikely a user of either server would even notice. Your CPU is going to be idle most of the time.
If you happen to run a request for each server at the exact same moment and that request would normally take 500ms to complete and most of that was not even CPU time, then perhaps each of these two requests might then take 750ms instead (slightly overlapping CPU time that must be shared). But, most of the time, you're not even going to encounter a request from each of your two servers running at the same time because there are so few requests anyway.
When calling back to the same server, at what point am I better off making one bigger call, versus multiple parallel requests.
In my particular case, assume that the server processing time (not including request processing, etc) is linear (e.g. 1 big call asking for 3 bits of data takes the same processing time as 3 smaller calls).
I know that if I have 1000 calls, I am better off batching them so as to not incur all the network overhead. But if I only have 2, I'm assuming parallel requests are probably better.
Is this right?
If so, where is the cutoff?
TL;DR: It depends on a number of factors that are highly dependent on your setup. If performance is a huge concern of yours, I would run tests, either with a third-party application like Wireshark, or by writing some performance-testing code on the server. In general, though, limit the number of parallel requests to under a handful if possible, by concatenating them.
In general, a few requests in parallel are okay. A modern browser will attempt to run them in parallel as much as possible, typically over a handful of TCP connections per host.
That being said, this starts to get bloated because every single request you make at your server using the HTTP/1.* protocol comes with headers, which can be huge, as they contain things like the referrer and browser cookies. The request body might be one character, but the request itself will be much larger.
Furthermore, the scenario changes with HTTP/2 (or even SPDY), the new transfer protocol. Requests over the wire here are treated differently, and don't always carry the extra weight of all the header metadata that normal requests do. So, if your server and browser support HTTP/2, you might be able to run more requests in parallel.
For the most part, though, you'll be running over HTTP/1.*, which means any more than a couple requests in parallel can see a serious performance impact (in the scenario you described for server processing time) for total completion time over one large load.
There's one other thing to consider, too, which is application dependent: when does that data matter? If you batch what would have been a ton of small requests into one larger one, none of the return data comes back until the entire operation is complete server-side. If you need to display data more rapidly, or you want to load things step by step for slower network conditions, the performance trade-off might be worth it for multiple small requests.
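The batching side of that trade-off can be as simple as chunking ids and issuing one request per chunk instead of one per id. A sketch (`chunk` is a hypothetical helper, not part of any library):

```javascript
// Split ids into batches so a single request can carry `size` lookups,
// amortizing the per-request header overhead over many items.
function chunk(ids, size) {
  const batches = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then become the body of one POST, with the response split back out client-side.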
Hope this explanation helps.
Definitely read up on the FAQ for HTTP/2: they also cover some of the same performance issues you'll run into with HTTP/1.* in the scenario you described.
I am creating a program in Node.js that extracts PDF text using the command-line utility pdftotext, spawning a child_process for each file. I would like to know if this process is CPU heavy and if it is possible for thousands of people to use it without breaking anything.
Is creating a child_process heavy? If pdftotext is not multithreaded, how can I scale? Do I need load balancing?
Thanks.
Let's break this down a bit:
I would like to know if this process is CPU heavy
I am not sure how CPU-intensive pdftotext is for a single file; that would also depend on how big each file is. But generally speaking, since extracting text from a PDF involves no asynchronous work and is CPU bound, I would expect the process to be CPU heavy, especially under a lot of load.
and if it is possible for thousands of people to use it without breaking anything.
Spawning a new process for every single file or on every single request is generally not a good idea. Spawning a process is an expensive operation that requires a lot of memory. Having thousands of people using your service at the same time would require thousands of processes to be open simultaneously on your server, which would exhaust memory; your server would max out at some limit and fail beyond it.
Is creating a child_process heavy? If pdftotext is not multithreaded, how can I scale? Do I need load balancing?
As mentioned, spawning a new process is never a cheap operation. It requires memory and resources.
Every file will run in a separate process. Whether pdftotext is implemented to open a single thread or multiple threads in a process is irrelevant here; either way, the process with all its threads will be competing for machine resources with other processes. Of course it is beneficial if it divides work among different threads and can execute in parallel, as this makes it faster. However, what you should be more concerned about is how long it takes to extract text from a single file, i.e. how long the process spends executing.
If you are to run this as a service, you would need to benchmark and optimize, and, depending on the load you want to support and on your benchmark results, you will likely have to load balance across a few high-end machines.
I hope I managed to answer some of your questions.
After reading about event loops and how async works in node.js, this is my understanding of node.js:
Node actually runs processes one at a time and not simultaneously.
Node really shines when multiple database I/O tasks are called.
It runs faster (than blocking I/O) because it doesn't wait for the response of one call before dealing with the next call. While dealing with the other call, when the result of the first call arrives, it "gets back to it", basically going back and forth between calls and callbacks without leaving the OS process idle, as opposed to what blocking I/O does. Please correct me if I'm wrong.
But here's my question:
Non-blocking I/O seems to be faster than blocking I/O only if the entity (server/process/thread?) that handles the request sent by Node is not the Node server itself.
What would be the cases when the server handling the request is the same server making the request? If my first bullet is correct, would blocking I/O using different threads for the task work faster than non-blocking I/O in this case?
Would file compression be an example to such I/O task that works faster on multithreaded blocking I/O?
The main benefit of non-blocking operations is that a relatively heavyweight CPU thread is not kept busy while the server is waiting for something to happen elsewhere (networking, disk I/O, etc...). This means that many different requests can be "in-flight" with only the single CPU thread and no thread is stuck waiting for I/O. A burden is placed back on the developer to write async-friendly code and to use async I/O operations, but in a heavy I/O bound operation, there can be a real benefit to server scalability. The single thread model also really simplifies access to shared resources since there is far, far less opportunity for threading conflicts, deadlocks, etc... This can result in fewer hard-to-find thread synchronization bugs that tend to only nail your server at the worst time (e.g. when it's busy).
Yes, non-blocking I/O only really helps if the agent handling the I/O operation is not node.js itself because the whole point of non-blocking I/O in node is that node is free to use its single thread to go do other things while the I/O operation is running and if it's node that is serving the I/O operation then that wouldn't be true.
Sorry, but I don't understand the part of your question about file compression. File compression takes a certain amount of CPU, no matter who handles it and there are a bunch of different considerations if you were trying to decide whether to handle it inside of node itself or in an outside process (running a different thread). That isn't a simple question. I'd probably start with using whatever code I already had for the compression (e.g. use node code if that's what you had or an external library/process if that's what you had) and only investigate a different option if you actually ran into a performance or scalability issue or knew you had an issue.
FYI, a simple mechanism for handling compression would be to spool the uncompressed data to files in a temporary directory from your node.js app and then have another process (which could be written in any system, even include node) that just looks for files in the temporary directory to which it applies the compression and then does something more permanent with the resulting compressed data.
For years, web developers have followed the logic that minimizing HTTP connections speeds up applications because the browser isn't choking on the download/execution of code. For example Yahoo has long touted their best practices, and tell us to combine CSS/JavaScript/image resources into single files - thereby reducing the total number of HTTP requests and compressing the total resource size.
But other "best practices" exist with regards to increasing webpage speed - specifically, maximizing the number of parallel HTTP downloads (from Google). This approach tells us that by spreading the HTTP connections across multiple hostnames the browser can do more simultaneously.
So as modern web applications are becoming very large (e.g. 3MB+ of JavaScript alone) the question must be asked:
Will my application load faster with 3MB+ of JavaScript in a single file? Or will it load faster with multiple, smaller files spread across hostnames?
For the sake of simplicity we should also assume other "best practices" are being followed, so this question best exists in a vacuum.
I have yet to see any empirical data on the subject, but I imagine there has to be a point where the performance of these approaches diverge - so knowing where that sweet-spot exists would be ideal.
I think this depends on the number of sockets available to the browser. Say the browser has its 4 sockets available; 4 smaller requests will be faster than 1 large request.
The trick here would be knowing at startup what requests your application will send and matching the number of requests to the number of sockets the browser can use. I believe browsers only have 4, but to be honest I haven't checked whether that number has changed in modern browsers.
Looks like each browser can have its own number of sockets, some having 2: Max parallel http connections in a browser?
https://stackoverflow.com/a/985704/925782 says IE10 is winner with 8 sockets, wow, go IE :)
Cache control would also play a part in this, of course, where the first load fetches everything and subsequent loads make fewer actual requests.
If you want to get geeky: http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/
I agree that some charts and real data would be a great blog post, my response is purely theoretical in nature.
I would pick parallel downloads.
Smaller JS files can be parsed faster than one monster-sized package. In most of the cases you do not need all of the JS at once either.
Concatenating assets is currently considered better practice because HTTP requests are expensive. One of HTTP/2.0's goals is to make requests cheap by multiplexing them within the same TCP connection. Server push in HTTP/2.0 can leverage this even more by sending some essential assets to the client ahead of time.
Chrome, Firefox, Opera and IE11 already support HTTP/2.0, and support is available for popular web servers (Apache, nginx).