I'm still trying to understand what worker threads are and how they differ from child processes, so please bear with me.
So I'm currently building a desktop app with Node.js + Electron. The app works on several tasks at a time, some of which are CPU- and I/O-intensive.
The architecture currently has one main process and a number of child processes matching the host's CPU core count.
The main process handles the Electron instance, the renderer process, and the local database process, and manages the other child processes.
Meanwhile, the child processes do the other tasks, which are CPU- and I/O-intensive in nature.
So far, I have 4 questions here:
In my case, is it more beneficial to use worker threads instead?
If a task requires several packages/libraries, will worker threads have to require them each time the task is run?
Currently, a child process has no access to the Electron API, so only the main process handles it. Would using worker threads allow me to access the Electron API?
In simple terms, what's the difference between worker threads and a thread pool? And should I use a thread pool instead of the other two (child processes and worker threads)?
Related
When a Node.js process is spun up, the top command shows 7 threads attached to the process. What are all these threads doing? Also, as the load on the API increases, with the request handlers themselves asynchronously awaiting other upstream API calls, does Node spawn additional worker threads? I see in top that it does. But I thought this only happened for file I/O. Why does it need these additional worker threads?
libuv (the underlying cross-platform system library that node.js is built on) uses a thread pool for certain operations such as disk I/O and some crypto operations. By default, that thread pool contains 4 threads.
Plus, there is a thread for the execution of your JavaScript, so that accounts for 5.
Then, it appears there is a thread used by the garbage collector for background marking of objects (per this reference from a V8 developer and this article). That would make 6.
I don't know for sure what the 7th one would be. It's possible there's a thread used by the event loop itself.
Then, starting some time around 2018, it appears that node.js switched to a separate set of threads for DNS requests (separate from the file I/O thread pool). This was probably because of problems in node.js where 4 slow DNS requests could block all file I/O by taking over the thread pool. So it now looks like node.js uses the c-ares library for DNS, which makes its own set of threads.
FYI, you can actually control the thread pool size with the UV_THREADPOOL_SIZE environment variable.
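For illustration, a minimal sketch (my own example, not anything from the question): the variable has to be set before the pool is first used, so either pass it when launching the process (UV_THREADPOOL_SIZE=8 node app.js) or set it at the very top of the entry file.

```js
// Raise the libuv thread pool from the default 4 threads to 8.
// This only takes effect if it happens before the pool is first touched,
// i.e. before any fs / crypto / dns.lookup work is queued.
process.env.UV_THREADPOOL_SIZE = '8';

const crypto = require('crypto');

// pbkdf2 runs on the libuv thread pool, so with 8 threads all eight
// of these hashes can run in parallel instead of queueing 4 at a time.
for (let i = 0; i < 8; i++) {
  crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err) => {
    if (err) throw err;
    console.log(`hash ${i} done`);
  });
}
```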
And, of course, you can create your own Worker Threads that actually create new instances of the V8 Javascript execution engine (so they will likely end up creating more than one new thread).
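To make that concrete, here is a minimal two-file worker_threads sketch; the file name and the Fibonacci payload are made up for the example.

```js
// main.js: the Worker below gets its own V8 instance and event loop
const { Worker } = require('worker_threads');

const worker = new Worker('./fib-worker.js', { workerData: { n: 40 } });
worker.on('message', (result) => console.log('fib(40) =', result));
worker.on('error', (err) => console.error('worker failed:', err));
```

```js
// fib-worker.js: CPU-heavy work runs here without blocking the main thread
const { parentPort, workerData } = require('worker_threads');

function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

parentPort.postMessage(fib(workerData.n));
```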
I am trying to use worker threads with a worker pool in my application, which is intended to run in a 256MB docker container.
My main thread takes around 30MB of memory and one worker thread takes around 25MB (considering the require of third-party node modules). Given this, I would only be able to create a pool of ~7 workers.
But my application requirements are such that it should be able to handle many jobs at a time, by spinning up many workers listening for jobs (around 20 or more).
Is there any way to share third-party modules (lodash, request, etc.) across worker threads, to save the memory it takes to require all the necessary modules in each worker?
My initial thought was to try shared memory (SharedArrayBuffer), but that won't work because it doesn't allow passing such complex object structures and functions.
Can anyone suggest a possible solution?
Thanks in advance!
In a tutorial I've read that one should use Node's event-loop approach mainly for I/O-intensive tasks, like reading from the hard disk or using the network, but not for CPU-intensive tasks.
What's the concrete reason for the quoted statements?
Or, asked the other way around:
What would happen if you occupied Node.js with CPU-intensive tasks?
Node uses a small number of threads to handle many clients. In Node there are two types of threads: one Event Loop (aka the main loop, main thread, event thread, etc.), and a pool of k Workers in a Worker Pool (aka the threadpool).
If a thread is taking a long time to execute a callback (Event Loop) or a task (Worker), we call it "blocked". While a thread is blocked working on behalf of one client, it cannot handle requests from any other clients.
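As a toy illustration of what "blocked" means (my own example, not from the guide): one synchronous CPU-heavy handler stalls every other client until it finishes.

```js
const http = require('http');

http.createServer((req, res) => {
  if (req.url === '/block') {
    // Synchronous CPU work: the event loop can do nothing else
    // until this loop finishes, so every other request waits.
    let sum = 0;
    for (let i = 0; i < 5e9; i++) sum += i;
    res.end(`done: ${sum}`);
  } else {
    // A fast handler, but it sits behind /block if that is running.
    res.end('hello');
  }
}).listen(3000);
```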
You can read more about it in the official Node.js guide.
I'm working on a small side project and would like to grow it out, but I'm not too sure how. My question is: how should I design my Node.js worker application to be able to execute multiple long-running jobs at the same time? (i.e. should I be using multiprocessing libraries, a load balancer, etc.)
My current situation is that I have a Node.js app running purely to serve web requests and put jobs on a queue, while another Node.js app reads off that queue and carries out those jobs (on a Heroku worker dyno). Each job may take anywhere from 1 hour to 1 week of purely writing to a database. Due to the nature of the job, and because it requires a specific npm package, I feel like I should be using Node, but at the same time I'm not sure it's the best option considering I would like to scale it so that hundreds of jobs can be executed at the same time.
Any advice/suggestions as to how I should architect this design would be appreciated. Thank you.
First off, a single node.js app can handle lots of jobs that are just reading/writing from a database because those activities are mostly asynchronous which means node.js is spending most of its time doing nothing while waiting for the database to respond back from the last request. So, you could probably have a single node.js app handle literally at least hundreds of jobs, perhaps even thousands of jobs (depending upon exactly what the jobs are doing). In fact, I wouldn't be surprised if a single node.js app could throw more work at your database than the database could possibly keep up with.
Then, if you want to scale how many worker node.js apps are running these jobs, you can simply fire up as many worker apps as you want (and as many as your hardware can handle) using the child_process module. You create one central work queue in your main node.js app. Then, create a bunch of child_processes whose job it is to grab N items from the work queue and process them. Note, I suggest you grab N items at once because a single node.js process can probably work on many separate jobs at once because of asynchronous I/O to your database.
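A rough sketch of that shape is below; the queue contents and processJob() are placeholders for whatever your real queue and jobs look like.

```js
// main.js: fork one worker per CPU and feed each a batch of N jobs at a time
const { fork } = require('child_process');
const os = require('os');

const queue = [/* ...job descriptions... */];

for (let i = 0; i < os.cpus().length; i++) {
  const worker = fork('./job-worker.js');

  const sendBatch = () => {
    const batch = queue.splice(0, 5); // grab N items at once
    if (batch.length) worker.send(batch);
  };

  worker.on('message', (msg) => {
    // the child asks for work when it starts and when a batch is done
    if (msg === 'ready' || msg === 'batch-done') sendBatch();
  });
}
```

```js
// job-worker.js: processJob is a stand-in for the real (mostly async) work
async function processJob(job) {
  // e.g. await db.insert(job) ...
  return new Promise((resolve) => setTimeout(resolve, 100));
}

process.on('message', async (batch) => {
  await Promise.all(batch.map((job) => processJob(job)));
  process.send('batch-done'); // ask the parent for another batch
});

process.send('ready');
```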
You may also want to explore the cluster module which doesn't even need a work queue. You can just fire up as many clustered instances of your main app as you want and they can all share the workload (both serving web pages and working on the long running jobs). The usual guideline is to set up a clustered instance for each CPU you have in the computer. So, if you have 4 cores, you would set up a cluster with a total of four servers in it.
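And a minimal cluster sketch, again with a placeholder request handler:

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // One worker per CPU core; they all share the same listening socket.
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on('exit', () => cluster.fork()); // replace a crashed worker
} else {
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}`);
  }).listen(3000);
}
```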
There is a lot of info about the application (Node) and renderer (Chromium) processes in Electron, and about the communication between these processes via data marshalling through IPC and separated contexts.
But there is no info about the event loop.
So here is the question: how many event loops are used in Electron? Are there several event loops (one for each process: app, renderers), and if so, how does libuv work with them? Or is there one event loop shared between these processes?