Node cluster is not dispatching tasks to another available worker - javascript

This is my first question on Stack Overflow, so please pardon any mistakes or missing information.
I am trying to use the cluster module of Node.js for my server, and I run Node.js on a Windows machine. I know that Node.js does not use the round-robin scheduling policy for the cluster module on Windows by default, so I have explicitly set the scheduling policy to rr as mentioned in the Node.js docs.
The problem is that when I keep one worker busy by putting it in an infinite loop, the server does not dispatch the request to another worker that is available and free when I request the '/' resource.
Please help me understand why it is not dispatching the request to the other workers.
var cluster = require('cluster');

if (cluster.isMaster) {
    var cores = require('os').cpus().length;
    console.log("Master Cluster setting up :- " + cores + " workers");
    for (var i = 0; i < cores; i++)
        cluster.fork();
    cluster.on('online', (worker) => {
        console.log("Worker with Process ID :- " + worker.process.pid + " online");
    });
    cluster.on('exit', (worker) => {
        console.log("worker " + worker.process.pid + " died...So setting up a new worker");
        cluster.fork();
    });
} else {
    var app = require('express')();
    app.get('/', (req, res) => {
        console.log("Process with pid " + process.pid + " is handling this request");
        while (true);
        res.write("Yes!");
        res.end();
        //while(true);
    });
    app.listen('3000');
}

The code you have written works fine. But you need to send concurrent requests to your Node server instead of requesting it in a loop; then you will see the power of the cluster module.
The Node cluster module has two approaches for distributing incoming connections.
The first one (and the default on all platforms except Windows) is the round-robin approach.
The second approach is where the master process creates the listen socket and sends it to interested workers. The workers then accept incoming connections directly.
You can use siege (http://blog.remarkablelabs.com/2012/11/benchmarking-and-load-testing-with-siege) to send concurrent requests. Then you will see requests being switched between your worker processes.
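If you would rather not install a load-testing tool, a minimal sketch like the following can fire concurrent requests from Node itself (assuming the server above is listening on port 3000; the request count of 4 is arbitrary):
// concurrent-requests.js - fire several requests at once so the master
// has more than one connection to distribute across workers
var http = require('http');

for (var i = 0; i < 4; i++) {
    http.get('http://localhost:3000/', function (res) {
        console.log('got status ' + res.statusCode);
        res.resume(); // drain the response so the socket is freed
    }).on('error', function (err) {
        console.log('request failed: ' + err.message);
    });
}
Note that because one worker deliberately blocks in while(true), at least one of these requests will never complete; the point is to observe the other requests being handled by different worker pids.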

Related

How to run simultaneous Node child processes

TL;DR: I have an endpoint on an Express server that runs some CPU-bound logic in a child_process. The problem is that if the server gets more than one request for that endpoint, it won't run both requests simultaneously; it queues them up and runs them one at a time. Is there a way to use Node child_process so that my server will perform multiple child processes simultaneously?
Long version: The major downfall of Node is that it is single-threaded, and a logic-heavy (CPU-bound) request can make the server stop dead in its tracks so that it can't take any more requests until that logic is finished running. I thought that I could work around this using child_process, which is working great in freeing up my server to take other requests. BUT it will only execute child processes one at a time, creating a queue that can get pretty backed up. I also have a Node cluster setup so that my server is split into 8 separate "virtual servers" (8-core machine), so I guess I can technically run 8 of these child processes at once, but I want to be able to handle more traffic than that. I'm looking for a solution that will still allow me to use Node and Express; please only suggest different technologies if you are absolutely sure this can't be done efficiently in my current environment. Thanks in advance for the help!
Endpoint:
var child_process = require('child_process');

app.get('/cpu-exec-file', function(req, res) {
    child_process.execFile('node', ['./blocking_tasks/mathCruncher.js'], {timeout: 30000}, function(err, stdout, stderr) {
        if (err) return res.status(500).send(err.message);
        var data = JSON.parse(stdout);
        res.send(data);
    });
});
mathCruncher.js:
var obj = {};

function myLoop(i) {
    setTimeout(function () {
        obj[i] = Math.random() * 100;
        if (--i) {
            myLoop(i);
        } else {
            var string = JSON.stringify(obj);
            console.log(string); // goes to stdout.
        }
    }, 1000);
}

myLoop(10);
Is there a way to use Node child_process so that my server will perform multiple child processes simultaneously?
Message queue and back-end processes.
I do exactly what you're wanting, using RabbitMQ. There are several other great messaging systems out there, like ZeroMQ and even Redis with some pub/sub libraries on top of it.
The gist of it is to send a request to your queueing system and have another process pick up the message, then run the process to do the work.
If you need a response from the worker, you can use bi-directional messaging with either a Request/Reply setup, or use status messages for really long-running things.
If you're interested in the RabbitMQ side of things, I have a free email course on various patterns with RabbitMQ, including Request/Reply and status emails: http://derickbailey.com/email-courses/rabbitmq-patterns-for-applications/
And if you're interested in ground-up training on RMQ with Node, check out my training course at http://rabbitmq4devs.com
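For illustration, a minimal sketch of that queueing pattern with the amqplib package might look like the following (the queue name 'cpu_tasks' and the localhost broker URL are placeholders chosen for this example):
// worker.js - a separate process that picks heavy jobs off a queue
var amqp = require('amqplib/callback_api');

amqp.connect('amqp://localhost', function (err, conn) {
    if (err) throw err;
    conn.createChannel(function (err, ch) {
        if (err) throw err;
        var q = 'cpu_tasks'; // placeholder queue name
        ch.assertQueue(q, { durable: false });
        ch.consume(q, function (msg) {
            var job = JSON.parse(msg.content.toString());
            // ...do the heavy work here, outside the web server process...
            console.log('processed job', job);
        }, { noAck: true });
    });
});
The web server side would then publish with something like ch.sendToQueue('cpu_tasks', Buffer.from(JSON.stringify(job))) and either reply immediately with a job id or use one of the Request/Reply patterns mentioned above.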

Node JS live text update with CloudMQTT

I have a node server which connects to CloudMQTT and receives messages in app.js. My client web app runs on the same node server, and I want to display the messages received in app.js elsewhere, in a .ejs file. I'm struggling with how best to do this.
app.js
// Create a MQTT Client
var mqtt = require('mqtt');

// Create a client connection to CloudMQTT for live data
var client = mqtt.connect('xxxxxxxxxxx', {
    username: 'xxxxx',
    password: 'xxxxxxx'
});

client.on('connect', function() { // When connected
    console.log("Connected to CloudMQTT");
    // Subscribe to the temperature
    client.subscribe('Motion', function() {
        // When a message arrives, do something with it
        client.on('message', function(topic, message, packet) {
            // ** Need to pass message out **
        });
    });
});
Basically you need a way for the client (browser code with EJS: HTML, CSS and JS) to receive live updates. There are two ways to do this from the client to the node service:
A websocket session instantiated by the client.
A polling approach.
What's the difference?
Under the hood, a websocket is a full-duplex communication mechanism. That means that you can open a socket from the client (browser) to the node server and they can talk to each other both ways over a long-lived session. The pro is that updates are often instantaneous, without having to incur the cost of making another HTTP request as in the polling case. The con is that it uses a socket connection that may be long-lived, and there is typically a socket pool on any server that has a limited ability to deal with many sockets. There are ways to scale around this issue, but if it's a big concern for you, you may want to go with polling.
Polling is where you set up an endpoint on your server that the client JS code hits every now and then. That endpoint will return you the updated information. The con is that you are now making a new request in order to get updates, which may not be desirable if a lot of updates are expected to come through and the app is expected to be updated in the timeliest manner possible (most of the time polling is sufficient though). The pro is that you do not have a live connection open on the server indefinitely.
Again, there are many more pros and cons, these are just the obvious ones. You decide how to implement it. When the client receives the data from either of these mechanisms, you may update the UI in any suitable manner.
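As a concrete illustration of the websocket route, here is a minimal sketch using the ws package to broadcast each MQTT message to every connected browser (the port 8080 and the choice of ws rather than socket.io are assumptions made for this example):
var WebSocket = require('ws');
var wss = new WebSocket.Server({ port: 8080 }); // assumed port

// call this from the client.on('message', ...) handler in app.js,
// in place of the "Need to pass message out" placeholder
function broadcast(topic, message) {
    wss.clients.forEach(function (ws) {
        if (ws.readyState === WebSocket.OPEN) {
            ws.send(JSON.stringify({ topic: topic, payload: message.toString() }));
        }
    });
}
On the browser side, the EJS-rendered page would open new WebSocket('ws://localhost:8080') and update the DOM in its onmessage handler.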
From the server end, you will need a way to persist the information coming from CloudMQTT. There are multiple ways to do this. If you do not care about memory consumption and are ok with potentially throwing away old data if a client does not ask for it for a while, then it may be ok to just store this in memory in a regular javascript object {}. If you do care about persisting the data between server restarts/crashes (probably best), then you can persist to something like Redis, Mongo, any of the SQL stores if your data is relational in nature, or even a regular JSON file on disk (see fs.writeFile).
Hope this helped give you a step in the right direction!

In NodeJS, how do I re-establish a socket connection with another server that may have gone down?

So, I have an Express NodeJS server that is making a connection with another app via an upgraded WebSocket URI for a data feed. If this app goes down, then obviously the WebSocket connection gets closed. I need to reconnect with this URI once the app comes back online.
My first approach was to use a while loop in the socket.onclose function to keep attempting to make the re-connection once the app comes back online, but this didn't seem to work as planned. My code looks like this:
socket.onclose = function() {
    while (socket.readyState != 1) {
        try {
            socket = new WebSocket("URI");
            console.log("connection status: " + socket.readyState);
        }
        catch (err) {
            //send message to console
        }
    }
};
This approach keeps giving me a socket.readyState of 0, even after the app the URI is accessing is back online.
Another approach I took was to use the JavaScript setTimeout function to attempt the connection using an exponential backoff algorithm. Using this approach, my code in the socket.onclose function looks like this:
socket.onclose = function() {
    var time = generateInterval(reconnAttempts); //generateInterval generates the random time based on the exponential backoff algorithm
    setTimeout(function() {
        reconnAttempts++; //another attempt so increment reconnAttempts
        socket = new WebSocket("URI");
    }, time);
};
The problem with this attempt is that if the app is still offline when the socket connection is attempted, I get the following error, for obvious reasons, and the node script terminates:
events.js:85
throw er; // Unhandled 'error' event
Error: connect ECONNREFUSED
at exports._errnoException (util.js:746:11)
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1010:19)
I also began using the forever node module to ensure that my node script is always running and to make sure it gets restarted after an unexpected exit. Even though I'm using forever, after a few restarts, forever just stops the script anyway.
I am basically just looking for a way to make my NodeJS server more robust and automatically re-connect with another server that may have gone down for some reason, instead of having to manually restart the node script.
Am I completely off base with my attempts? I am a noob when it comes to NodeJS so it may even be something stupid that I'm overlooking, but I have been researching this for a day or so now and all of my attempts don't seem to work as planned.
Any suggestions would be greatly appreciated! Thanks!
A few suggestions:
1) Start using domain, which prevents your app from unexpected termination, i.e. your app will run under the domain (the run method of domain). You can implement an alert mechanism such as email or SMS which notifies you when any error occurs.
2) Start using socket.io for websocket communication; it handles reconnection automatically. Socket.io uses a keep-alive heartbeat and continuously polls the server.
3) Start using pm2 instead of forever. Pm2 allows clustering for your app, which improves performance.
I think this may improve your app's performance, stability and robustness.
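If you prefer to stay with a plain WebSocket client rather than switching to socket.io, the crash can be avoided by attaching an 'error' handler so that ECONNREFUSED no longer surfaces as an unhandled 'error' event. A minimal sketch with the ws package ("URI" and generateInterval are taken from the question; the rest is just one possible way to structure the retry loop):
var WebSocket = require('ws');
var reconnAttempts = 0;
var socket;

function connect() {
    socket = new WebSocket("URI"); // placeholder URI, as in the question

    socket.on('open', function () {
        reconnAttempts = 0; // connected, so reset the backoff counter
    });

    // Without this handler, ECONNREFUSED becomes an unhandled 'error'
    // event and the node script terminates.
    socket.on('error', function (err) {
        console.log('connection error: ' + err.message);
    });

    socket.on('close', function () {
        reconnAttempts++;
        setTimeout(connect, generateInterval(reconnAttempts));
    });
}

connect();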

Node.js: Closing all Redis clients on shutdown

Today, I integrated Redis into my node.js application and am using it as a session store. Basically, upon successful authentication, I store the corresponding user object in Redis.
When I receive http requests after authentication, I attempt to retrieve the user object from Redis using a hash. If the retrieval was successful, that means the user is logged in and the request can be fulfilled.
The act of storing the user object in Redis and the retrieval happen in two different files, so I have one Redis client in each file.
Question 1:
Is it ok having two Redis clients, one in each file? Or should I instantiate only one client and use it across all areas of the application?
Question 2:
Does the node-redis library provide a method to show a list of connected clients? If it does, I will be able to iterate through the list, and call client.quit() for each of them when the server is shutting down.
By the way, this is how I'm implementing the "graceful shutdown" of the server:
//Gracefully shutdown and perform clean-up when kill signal is received
process.on('SIGINT', cleanup);
process.on('SIGTERM', cleanup);

function cleanup() {
    server.stop(function() {
        //todo: quit all connected redis clients
        console.log('Server stopped.');
        //exit the process
        process.exit();
    });
};
In terms of design and performance, it's best to create one client and use it across your application. This is pretty easy to do in node. I'm assuming you're using the redis npm package.
First, create a file named redis.js with the following contents:
const redis = require('redis');

const RedisClient = (function() {
    return redis.createClient();
})();

module.exports = RedisClient;
Then, say in a file set.js, you would use it like so:
const client = require('./redis');
client.set('key', 'value');
Then, in your index.js file, you can import it and close the connection on exit:
const client = require('./redis');

process.on('SIGINT', cleanup);
process.on('SIGTERM', cleanup);

function cleanup() {
    client.quit(function() {
        console.log('Redis client stopped.');
        server.stop(function() {
            console.log('Server stopped.');
            process.exit();
        });
    });
};
Using multiple connections may be required by how the application uses Redis.
For instance, as soon as a connection is used for the purpose of listening to a pub/sub channel, it can only be used for this and nothing else. Per the documentation on SUBSCRIBE:
Once the client enters the subscribed state it is not supposed to issue any other commands, except for additional SUBSCRIBE, PSUBSCRIBE, UNSUBSCRIBE and PUNSUBSCRIBE commands.
So if your application needs to subscribe to channels and use Redis as general value cache, then it needs two clients at a minimum: one for subscribing to channels and one for using Redis as a cache.
There are also Redis commands that are blocking like BLPOP. A busy web server normally replies to multiple requests at once. Suppose that for answering request A the server uses its Redis client to issue a blocking command. Then request B comes and the server needs to answer Redis with a non-blocking command but the client is still waiting for the blocking command issued for request A to finish. Now the response to request B is delayed by another request. This can be avoided by using a different client for the second request.
If you do not use any of the facilities that require more than one connection, then you can and should use just one connection.
If the way you use Redis is such that you need more than one connection, and you just need a list of connections but no sophisticated connection management, you could just create your own factory function: it would call redis.createClient() and save the client before returning it. Then at shutdown time, you could go over the list of saved clients and close them. Unfortunately, node-redis does not provide such functionality built-in.
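A minimal sketch of such a factory (the file name redis-clients.js and the helper names are made up for illustration; node-redis itself does not ship this):
// redis-clients.js - hand out clients and remember them for shutdown
const redis = require('redis');

const clients = [];

function createClient() {
    const client = redis.createClient();
    clients.push(client); // remember it so we can close it later
    return client;
}

function quitAll(done) {
    let remaining = clients.length;
    if (remaining === 0) return done();
    clients.forEach((client) => {
        client.quit(() => {
            if (--remaining === 0) done();
        });
    });
}

module.exports = { createClient, quitAll };
At shutdown you would call quitAll(() => process.exit()) from your SIGINT/SIGTERM handler.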
If you need more sophisticated client management than the factory function described above, then the typical way to manage the multiple connections created is to use a connection pool but node-redis does not provide one. I usually access Redis through Python code so I don't have a recommendation for Node.js libraries, but an npm search shows quite a few candidates.

What's the most efficient node.js inter-process communication library/method?

We have a few node.js processes that should be able to pass messages.
What's the most efficient way of doing that?
How about using node_redis pub/sub
EDIT: the processes might run on different machines
If you want to send messages from one machine to another and do not care about callbacks then Redis pub/sub is the best solution. It's really easy to implement and Redis is really fast.
First you have to install Redis on one of your machines.
It's really easy to connect to Redis:
var client = require('redis').createClient(redis_port, redis_host);
But do not forget about opening the Redis port in your firewall!
Then you have to subscribe each machine to some channel:
client.on('ready', function() {
    return client.subscribe('your_namespace:machine_name');
});

client.on('message', function(channel, json_message) {
    var message;
    message = JSON.parse(json_message);
    // do whatever you want with the message
});
You may skip your_namespace and use global namespace, but you will regret it sooner or later.
It's really easy to send messages, too:
var send_message = function(machine_name, message) {
    return client.publish("your_namespace:" + machine_name, JSON.stringify(message));
};
If you want to send different kinds of messages, you can use pmessages instead of messages:
client.on('ready', function() {
    return client.psubscribe('your_namespace:machine_name:*');
});

client.on('pmessage', function(pattern, channel, json_message) {
    // pattern === 'your_namespace:machine_name:*'
    // channel === 'your_namespace:machine_name:'+message_type
    var message = JSON.parse(json_message);
    var message_type = channel.split(':')[2];
    // do whatever you want with the message and message_type
});

send_message = function(machine_name, message_type, message) {
    return client.publish([
        'your_namespace',
        machine_name,
        message_type
    ].join(':'), JSON.stringify(message));
};
The best practice is to name your processes (or machines) by their functionality (e.g. 'send_email'). In that case process (or machine) may be subscribed to more than one channel if it implements more than one functionality.
Actually, it's possible to build bi-directional communication using Redis. But it's more tricky, since it would require adding a unique callback channel name to each message in order to receive a callback without losing context.
So, my conclusion is this: use Redis if you need "send and forget" communication, and investigate other solutions if you need full-fledged bi-directional communication.
Why not use ZeroMQ/0mq for IPC? Redis (a database) is overkill for doing something as simple as IPC.
Quoting the guide:
ØMQ (ZeroMQ, 0MQ, zmq) looks like an embeddable networking library
but acts like a concurrency framework. It gives you sockets that carry
atomic messages across various transports like in-process,
inter-process, TCP, and multicast. You can connect sockets N-to-N with
patterns like fanout, pub-sub, task distribution, and request-reply.
It's fast enough to be the fabric for clustered products. Its
asynchronous I/O model gives you scalable multicore applications,
built as asynchronous message-processing tasks.
The advantage of using 0MQ (or even vanilla sockets via net library in Node core, minus all the features provided by a 0MQ socket) is that there is no master process. Its broker-less setup is best fit for the scenario you describe. If you are just pushing out messages to various nodes from one central process you can use PUB/SUB socket in 0mq (also supports IP multicast via PGM/EPGM). Apart from that, 0mq also provides for various different socket types (PUSH/PULL/XREP/XREQ/ROUTER/DEALER) with which you can create custom devices.
Start with this excellent guide:
http://zguide.zeromq.org/page:all
For 0MQ 2.x:
http://github.com/JustinTulloss/zeromq.node
For 0MQ 3.x (A fork of the above module. This supports PUBLISHER side filtering for PUBSUB):
http://github.com/shripadk/zeromq.node
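To make the PUB/SUB idea concrete, here is a minimal sketch assuming the classic API of the zmq bindings linked above (the tcp://127.0.0.1:5555 endpoint and the 'updates' topic prefix are arbitrary choices for this example):
// publisher.js
var zmq = require('zmq');
var pub = zmq.socket('pub');
pub.bindSync('tcp://127.0.0.1:5555'); // arbitrary endpoint

setInterval(function () {
    pub.send('updates ' + JSON.stringify({ t: Date.now() })); // topic prefix + payload
}, 1000);

// subscriber.js
var zmq = require('zmq');
var sub = zmq.socket('sub');
sub.connect('tcp://127.0.0.1:5555');
sub.subscribe('updates'); // filter on the topic prefix

sub.on('message', function (msg) {
    console.log('received: ' + msg.toString());
});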
More than 4 years after the question was asked, there is an interprocess communication module called node-ipc. It supports unix/windows sockets for communication on the same machine, as well as TCP, TLS and UDP, claiming that at least sockets, TCP and UDP are stable.
Here is a small example taken from the documentation from the github repository:
Server for Unix Sockets, Windows Sockets & TCP Sockets
var ipc = require('node-ipc');

ipc.config.id = 'world';
ipc.config.retry = 1500;

ipc.serve(
    function() {
        ipc.server.on(
            'message',
            function(data, socket) {
                ipc.log('got a message : '.debug, data);
                ipc.server.emit(
                    socket,
                    'message',
                    data + ' world!'
                );
            }
        );
    }
);

ipc.server.start();
Client for Unix Sockets & TCP Sockets
var ipc = require('node-ipc');

ipc.config.id = 'hello';
ipc.config.retry = 1500;

ipc.connectTo(
    'world',
    function() {
        ipc.of.world.on(
            'connect',
            function() {
                ipc.log('## connected to world ##'.rainbow, ipc.config.delay);
                ipc.of.world.emit(
                    'message',
                    'hello'
                );
            }
        );
        ipc.of.world.on(
            'disconnect',
            function() {
                ipc.log('disconnected from world'.notice);
            }
        );
        ipc.of.world.on(
            'message',
            function(data) {
                ipc.log('got a message from world : '.debug, data);
            }
        );
    }
);
I'm currently evaluating this module as a replacement for an old local IPC solution via stdin/stdout (it could be used for remote IPC in the future). Maybe I will expand my answer when I'm done to give some more information about how well this module works.
I would start with the built-in functionality that Node provides.
You can use process signalling like:
process.on('SIGINT', function () {
    console.log('Got SIGINT. Press Control-D to exit.');
});
This signalling is documented as:
Emitted when the processes receives a signal. See sigaction(2) for a
list of standard POSIX signal names such as SIGINT, SIGUSR1, etc.
Once you know about process, you can spawn a child process and hook it up to the message event to retrieve and send messages. When using child_process.fork() you can write to the child using child.send(message, [sendHandle]) and messages are received by a 'message' event on the child.
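A minimal sketch of that parent/child message exchange (the file name child.js is just for illustration):
// parent.js
var child_process = require('child_process');
var child = child_process.fork('./child.js'); // assumed file name

child.on('message', function (msg) {
    console.log('parent got:', msg);
});

child.send({ hello: 'from parent' });

// child.js
process.on('message', function (msg) {
    console.log('child got:', msg);
    process.send({ hello: 'from child' });
});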
Also - you can use cluster. The cluster module allows you to easily create a network of processes that all share server ports.
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    // Fork workers.
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    cluster.on('exit', function(worker, code, signal) {
        console.log('worker ' + worker.process.pid + ' died');
    });
} else {
    // Workers can share any TCP connection
    // In this case its a HTTP server
    http.createServer(function(req, res) {
        res.writeHead(200);
        res.end("hello world\n");
    }).listen(8000);
}
For 3rd party services you can check:
hook.io, signals and bean.
Take a look at node-messenger:
https://github.com/weixiyen/messenger.js
It will fit most needs easily (pub/sub, fire and forget, send/request) with an automatically maintained connection pool.
We are working on a multi-process node app which needs to handle a large number of real-time cross-process messages.
We tried Redis pub/sub first, which failed to meet the requirements.
Then we tried TCP sockets, which were better, but still not the best.
So we switched to UDP datagrams, which are much faster.
Here is the code repo, just a few lines of code:
https://github.com/SGF-Games/node-udpcomm
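For reference, the same datagram approach can be sketched with Node's built-in dgram module (the port 41234 and the localhost address are arbitrary; keep in mind UDP gives no delivery guarantees):
var dgram = require('dgram');

// receiver process
var receiver = dgram.createSocket('udp4');
receiver.on('message', function (msg, rinfo) {
    console.log('got ' + msg + ' from ' + rinfo.address + ':' + rinfo.port);
});
receiver.bind(41234); // arbitrary port

// sender (would normally live in another process)
var sender = dgram.createSocket('udp4');
var payload = Buffer.from(JSON.stringify({ hello: 'world' }));
sender.send(payload, 41234, '127.0.0.1', function (err) {
    sender.close();
});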
I needed IPC between web server processes in another language (Perl;) a couple years ago. After investigating IPC via shared memory, and via Unix signals (e.g. SIGINT and signal handlers), and other options, I finally settled on something quite simple which works quite well and is fast. It may not fit the bill if your processes do not all have access to the same file system, however.
The concept is to use the file system as the communication channel. In my world, I have an EVENTS dir, and under it sub dirs to direct the message to the appropriate process: e.g. /EVENTS/1234/player1 and /EVENTS/1234/player2 where 1234 is a particular game with two different players. If a process wants to be aware of all events happening in the game for a particular player, it can listen to /EVENTS/1234/player1 using (in Node.js):
fs.watch
(or fsPromises.watch)
If a process wanted to listen to all events for a particular game, simply watch /EVENTS/1234 with the 'recursive: true' option set for fs.watch. Or watch /EVENTS to see all msgs -- the event produced by fs.watch will tell you which file path was modified.
For a more concrete example, in my world I have the web browser client of player1 listening for Server-Sent Events (SSE), and there is a loop running in one particular web server process to send those events. Now, a web server process servicing player2 wants to send a message (IPC) to the server process running the SSEs for player1, but doesn't know which process that might be; it simply writes (or modifies) a file in /EVENTS/1234/player1. That directory is being watched -- via fs.watch -- in the web server process handling SSEs for player1. I find this system very flexible and fast, and it can also be designed to leave a record of all messages sent. I use it so that one random web server process of many can communicate with one other particular web server process, but it could also be used in an N-to-1 or 1-to-N manner.
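A minimal sketch of this file-system-as-message-bus idea (the /EVENTS/1234/player1 path mirrors the example above; writing one JSON file per message is just one possible convention):
var fs = require('fs');
var path = require('path');

var dir = '/EVENTS/1234/player1'; // directory from the example above

// listener process: react whenever a message file appears or changes
fs.watch(dir, function (eventType, filename) {
    if (!filename) return; // filename is not guaranteed on every platform
    fs.readFile(path.join(dir, filename), 'utf8', function (err, data) {
        if (!err) console.log('IPC message:', data);
    });
});

// sender process: drop a message file into the watched directory
fs.writeFile(path.join(dir, Date.now() + '.json'),
             JSON.stringify({ event: 'score', value: 10 }),
             function (err) { if (err) console.error(err); });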
Hope this helps someone. You're basically letting the OS and the file system do the work for you. Here are a couple links on how this works in MacOS and Linux:
https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/Introduction/Introduction.html#//apple_ref/doc/uid/TP40005289
https://man7.org/linux/man-pages/man7/inotify.7.html
Any module you're using in whatever language is hooking into an API like one of these. It's been 30+ years since I've fiddled much with Windows, so I don't know how file system events work there, but I bet there's an equivalent.
EDIT (more info on different platforms from https://nodejs.org/dist/latest-v19.x/docs/api/fs.html#fswatchfilename-options-listener):
Caveats
The fs.watch API is not 100% consistent across platforms, and is unavailable in some situations.
On Windows, no events will be emitted if the watched directory is moved or renamed. An EPERM error is reported when the watched directory is deleted.
Availability
This feature depends on the underlying operating system providing a way to be notified of file system changes.
On Linux systems, this uses inotify(7).
On BSD systems, this uses kqueue(2).
On macOS, this uses kqueue(2) for files and FSEvents for directories.
On SunOS systems (including Solaris and SmartOS), this uses event ports.
On Windows systems, this feature depends on ReadDirectoryChangesW.
On AIX systems, this feature depends on AHAFS, which must be enabled.
On IBM i systems, this feature is not supported.
If the underlying functionality is not available for some reason, then fs.watch() will not be able to function and may throw an exception. For example, watching files or directories can be unreliable, and in some cases impossible, on network file systems (NFS, SMB, etc) or host file systems when using virtualization software such as Vagrant or Docker.
It is still possible to use fs.watchFile(), which uses stat polling, but this method is slower and less reliable.
EDIT2: https://www.npmjs.com/package/node-watch is a wrapper that may help on some platforms
Not everybody knows that pm2 has an API through which you can communicate with its processes.
// pm2-call.js:
import pm2 from "pm2";

pm2.connect(() => {
    pm2.sendDataToProcessId(
        {
            type: "process:msg",
            data: {
                some: "data",
                hello: true,
            },
            id: 0,
            topic: "some topic",
        },
        (err, res) => {}
    );
});

pm2.launchBus((err, bus) => {
    bus.on("process:msg", (packet) => {
        packet.data.success.should.eql(true);
        packet.process.pm_id.should.eql(proc1.pm2_env.pm_id);
        done();
    });
});

// pm2-app.js:
process.on("message", (packet) => {
    process.send({
        type: "process:msg",
        data: {
            success: true,
        },
    });
});
