Node JS: Will a Bidirectional gRPC Call Open Multiple HTTP/2 Connections? - javascript

Will a bidirectional RPC call ever open multiple http2 connections?
I'm writing a gRPC client that talks to a gRPC server I don't own or control. I'm using the @grpc/grpc-js package. I've been asked whether this library will open multiple HTTP/2 connections to the gRPC endpoint, and I'm not familiar enough with the source code to answer this question. My code for making a call and opening a stream looks like this:
const grpc = require('@grpc/grpc-js')
const protoLoader = require('@grpc/proto-loader')
// Load the .proto file into a package definition
const packageDefinition = protoLoader.loadSync(
  __dirname + '/path/to/v1.proto',
  {keepCase: true,
   longs: String,
   enums: String,
   defaults: true,
   oneofs: true
  })
// Resolve the service namespace from the loaded definition
const v1 = grpc.loadPackageDefinition(packageDefinition).com.foo.bar.v1
const client = new v1.IngestService(
  'server.url.here.com:443',
  grpc.credentials.createSsl()
)
// Open the bidirectional stream
const stream = client.doTheThing(metadata)
I've started to look into this myself and I see that it's the Subchannel objects that initiate the http2 connections, so it seems like it's one http2 connection per sub-channel. However, the relationship between the call, the http2call stream, the main channel, the sub-channel(s?), load balancers, and filter stacks is unclear to me, and I can't reason about when (if at all) a second HTTP2 connection would ever be opened.
Ideally, if someone can answer the question "Will a bidirectional RPC call open multiple HTTP/2 connections?" that would be great. If that's too complicated an answer, I'd settle for a theory of operation on what the relationship between those various objects is intended to be, so I can reason about this myself, or anything else you think would help.

No matter what streaming type the request is, gRPC will use each single connection to open multiple streams. This is one major reason HTTP/2 was chosen as the underlying protocol for gRPC: multiplexing streams onto connections is already part of that protocol.
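To see the multiplexing itself, independent of grpc-js, here is a minimal sketch using Node's built-in http2 module. It is not how grpc-js is implemented internally; the local cleartext server, the /call/<n> paths and the demoMultiplexing name are assumptions made purely for illustration. The server counts how many sessions (connections) and how many streams it sees when one client session starts several requests at once.

```javascript
const http2 = require('node:http2');

// Starts a local cleartext HTTP/2 server, opens ONE client session, and
// starts `streamCount` concurrent streams on it. Resolves with the number of
// sessions and streams the server observed.
function demoMultiplexing(streamCount) {
  return new Promise((resolve, reject) => {
    let sessions = 0;
    let streams = 0;

    const server = http2.createServer();
    server.on('session', () => { sessions += 1; }); // one per connection
    server.on('stream', (stream) => {                // one per request
      streams += 1;
      stream.respond({ ':status': 200 });
      stream.end('ok');
    });
    server.on('error', reject);

    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const client = http2.connect(`http://127.0.0.1:${port}`);
      client.on('error', reject);

      let finished = 0;
      for (let i = 0; i < streamCount; i++) {
        const req = client.request({ ':path': `/call/${i}` });
        req.resume(); // discard the response body
        req.on('close', () => {
          finished += 1;
          if (finished === streamCount) {
            client.close();
            server.close();
            resolve({ sessions, streams });
          }
        });
        req.end();
      }
    });
  });
}

demoMultiplexing(3).then((counts) => console.log(counts));
```

All streams here share the single session created by http2.connect, which is the same protocol feature gRPC relies on when it runs several calls over one connection.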
Of the classes you mentioned, the Channel is the API-level abstraction over connections. A Channel represents any number of connections to backends referred to by the target string. It will automatically establish connections as needed to handle any requests that are initiated.
The Resolver, which you didn't mention, determines what backend addresses are associated with the target string. For example, the DnsResolver will look up DNS records.
A LoadBalancer determines what specific connections to establish and how to distribute requests among those connections. The default load balancing policy, "pick first" just sends all requests to whichever connection is successfully established first. There is also the "round robin" load balancing policy, which tries to establish connections to multiple backends and then cycles through them when starting calls.
A Subchannel represents a connection to a single backend, that can be reestablished if it drops.
The filter stack applies some transformations to requests between when they are initiated at the top-level API and when they are sent out on the network.
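The difference between the two load balancing policies described above can be sketched with toy pickers. This is only the selection logic under simplified assumptions, not grpc-js code: the connection objects, the connectedAt field and both function names are hypothetical.

```javascript
// Each toy connection has an id and a connectedAt timestamp
// (null means the connection is not established yet).

// "Pick first": every call goes to the connection that was established first.
function makePickFirst(connections) {
  return () => {
    const ready = connections.filter((c) => c.connectedAt !== null);
    if (ready.length === 0) return null;
    return ready.reduce((a, b) => (a.connectedAt <= b.connectedAt ? a : b));
  };
}

// "Round robin": calls cycle through all established connections.
function makeRoundRobin(connections) {
  let cursor = 0;
  return () => {
    const ready = connections.filter((c) => c.connectedAt !== null);
    if (ready.length === 0) return null;
    const picked = ready[cursor % ready.length];
    cursor += 1;
    return picked;
  };
}

const backends = [
  { id: 'backend-a', connectedAt: 30 },
  { id: 'backend-b', connectedAt: 10 },
  { id: 'backend-c', connectedAt: null },
];
const pickFirst = makePickFirst(backends);
const roundRobin = makeRoundRobin(backends);
console.log(pickFirst().id, pickFirst().id);
console.log(roundRobin().id, roundRobin().id, roundRobin().id);
```

With pick first, a single bidirectional call and every later call land on the same connection, so under that policy only one connection is in use at a time.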

Related

Monitoring mongodb connection to a replicaset in NodeJS

I want to connect to a MongoDB replica set (with only one instance, so that change streams work) while being notified when the connection is lost or re-established.
I followed what is described here:
const { MongoClient } = require("mongodb");
// Replace the following with your MongoDB deployment's connection
// string.
const uri =
"mongodb+srv://<clusterUrl>/?replicaSet=rs&writeConcern=majority";
const client = new MongoClient(uri);
// Replace <event name> with the name of the event you are subscribing to.
const eventName = "<event name>";
client.on(eventName, event => {
console.log(`received ${eventName}: ${JSON.stringify(event, null, 2)}`);
});
async function run() {
try {
await client.connect();
// Establish and verify connection
await client.db("admin").command({ ping: 1 });
console.log("Connected successfully");
} finally {
// Ensures that the client will close when you finish/error
await client.close();
}
}
run().catch(console.dir);
I tried subscribing to these events:
serverOpening, which works fine
serverClosed, which never fires, and I can't understand why
There is no "reconnect" event. Is there any solution?
You are mixing monitoring connections and application connections. Unfortunately the documentation you referenced doesn't talk about this and doesn't document CMAP events so the confusion is understandable. See the Ruby driver docs for a more in-depth explanation of the events that drivers publish (including the Node driver).
Monitoring connections are established by the driver to figure out what server(s) exist in the deployment it was instructed to work with. One (or two depending on driver and server version) such connection is established per known server. You don't control when these connections are established. These connections are NOT used for operations you initiate (inserts/finds etc.). They are only used internally by the driver.
The events published for monitoring connections are server opened, server closed, server heartbeat - the ones listed here. You are going to get these events when the client is instantiated (assuming a spec-compliant client, which the Node one is not in its default configuration as you are using it) without any operations being issued, such as creating a change stream.
Application connections are established by the driver to satisfy the application's operations like finds and inserts. One of these would be needed for your change stream. The events relevant to these connections are CMAP ones and start with "Connection" or "Pool", e.g. ConnectionCreated. These connections aren't established until you issue an operation, unless you have the min pool size on the client set to a value greater than zero.
If you want to "monitor connections", you can subscribe to either category of events or both.
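Subscribing to both categories could look like the sketch below. The event names are the ones I understand the Node driver to publish for server monitoring and for the connection pool; treat them as assumptions and check them against the documentation for your driver version. The watchConnections helper is hypothetical and works with any object that has an on() method.

```javascript
// Server monitoring (SDAM) events: published for the driver's internal
// monitoring connections. Names assumed from the Node driver docs.
const MONITORING_EVENTS = [
  'serverOpening',
  'serverClosed',
  'serverHeartbeatSucceeded',
  'serverHeartbeatFailed',
];

// Connection pool (CMAP) events: published for application connections.
// Names assumed from the Node driver docs.
const POOL_EVENTS = [
  'connectionPoolCreated',
  'connectionCreated',
  'connectionReady',
  'connectionClosed',
  'connectionPoolCleared',
];

// Attaches one listener per event and tags each log line with its category.
function watchConnections(client, log = console.log) {
  for (const name of MONITORING_EVENTS) {
    client.on(name, (event) => log('monitoring', name, event));
  }
  for (const name of POOL_EVENTS) {
    client.on(name, (event) => log('application', name, event));
  }
  return client;
}

// Usage with the driver (not run here):
//   const client = watchConnections(new MongoClient(uri));
//   await client.connect();
```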
With that said, both types of connections are managed internally by the driver. You don't get a say in when they are created or destroyed (other than setting min pool size and idle timeouts). So if your goal is to have a working, continuously-running, resuming change stream, you don't need any of this and instead you should be using the proper change stream consumption patterns like the one described here in Ruby syntax although all spec-compliant drivers should provide the equivalent interface.
Lastly, there isn't a "reconnect" event defined in any driver specification. If you have a question specifically about this event you should reference the driver documentation where it is described and read that documentation carefully to ascertain the implemented behavior.

Node.js Cluster for Multiple WebSocket Clients Connecting to Different WebSocket Servers?

I am using Node.js to implement a Websocket client that subscribes to datafeed from multiple Websocket servers.
foo = new WebSocket('ws://foo.host ...')
bar = new WebSocket('ws://barhost ...')
baz = new WebSocket('ws://baz.host ...')
qux = new WebSocket('ws://qux.host ...')
foo.on('data', data => doSomething(data)) // 5 events per second
bar.on('data', data => doSomething(data)) // 1 events per second
baz.on('data', data => doSomething(data)) // 1 events per second
qux.on('data', data => doSomething(data)) // 1 events per second
Question: If we have a multi-core system (eg. 4 cores), is it possible to make use of Node.js Cluster to load balance the processing of the incoming Websocket data, such that each core will approximately receive 2 events per second to be handled?
Or is it better to manually start 8 node.js instances and pass it an argument [foo|bar|baz|qux] for selecting the Websocket server it will connect to?
The nodejs clustering module solves one specific problem. When you have an http server and you want to load balance incoming connections among multiple processes, that's what the nodejs clustering module does. That is not what you have. You have multiple client-side outgoing webSocket connections and you apparently want to apply multiple processes to processing the incoming data. That's completely different from what the nodejs cluster module does.
First, it's important to understand that receiving the data is not a CPU intensive process for nodejs. The actual socket processing and receiving of incoming data onto the computer is handled by the OS and is outside the nodejs process.
So, if you actually need more than one CPU to work on this, it must be to process the incoming data, not to just receive it.
There are several different ways you could structure that.
1. You could have one central process that contains all the webSockets and then have any number of worker processes or worker threads that you pass incoming data to for processing. This would apply many CPUs to the processing of the data and would allow the processing load to be spread among the CPUs regardless of which socket the data arrived on.
2. You could create 4 separate child processes and have each child process create one of the four webSocket connections and then have each child process handle just the incoming data for its webSocket. This has the disadvantage that it only applies one process to each webSocket and if most of the data comes on one webSocket, then the other processes will be largely idle.
3. If one webSocket has a lot more load than the others and for some reason option #1 wouldn't work well, then you could combine #1 and #2. Create a separate process for each webSocket and then have some worker threads for processing the incoming data for each one. Create a work queue that incoming data is inserted into and work can be sent to each worker thread as it finishes its previous chunk of data.
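The first option, one process that owns the sockets plus a work queue feeding worker threads, can be sketched with the built-in worker_threads module. This is a minimal illustration under assumptions: createPool, the inline worker source and the upper-casing "work" are made up for the example, and error handling for crashed workers is left out.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code, kept inline so the sketch fits in one file.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ id, payload }) => {
    // Stand-in for CPU-heavy processing of one incoming message.
    parentPort.postMessage({ id, result: payload.toUpperCase() });
  });
`;

function createPool(size) {
  const workers = [];
  const idle = [];            // workers waiting for a job
  const queue = [];           // jobs waiting for a worker
  const pending = new Map();  // job id -> resolve function
  let nextId = 0;

  // Hand queued jobs to idle workers until one of the two runs out.
  function dispatch() {
    while (idle.length > 0 && queue.length > 0) {
      idle.pop().postMessage(queue.shift());
    }
  }

  for (let i = 0; i < size; i++) {
    const worker = new Worker(workerSource, { eval: true });
    worker.on('message', ({ id, result }) => {
      pending.get(id)(result);
      pending.delete(id);
      idle.push(worker); // the worker is free again
      dispatch();
    });
    workers.push(worker);
    idle.push(worker);
  }

  return {
    // Queue one message for processing; resolves with the worker's result.
    process(payload) {
      return new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        queue.push({ id, payload });
        dispatch();
      });
    },
    close() {
      return Promise.all(workers.map((w) => w.terminate()));
    },
  };
}

// In the real client each socket's data handler would call pool.process(data).
const pool = createPool(2);
Promise.all(['foo tick', 'bar tick', 'baz tick'].map((m) => pool.process(m)))
  .then((results) => {
    console.log(results);
    return pool.close();
  });
```

Because the queue is shared, a busy socket's messages spread over all workers instead of being tied to one process.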

Websockets not connected behind proxy

This is quite common problem, but I cannot find a solution to my specific case. I'm using Glassfish 4.1.1 and my application implements Websockets.
On a client side I'm connecting to WS-server simply by:
var serviceLocation = "ws://" + window.location.host + window.location.pathname + "dialog/";
var wsocket = new WebSocket(serviceLocation + token_var);
On the server side, websockets are implemented via the @ServerEndpoint functionality and look very common:
@ServerEndpoint(value = "/dialog/{token}", decoders = DialogMessageDecoder.class)
public class DialogWebsoketEndpoint {
    @OnOpen
    public void open(final Session session, @PathParam("token") final String token) { ... }
    // etc.
}
Everything works fine up to the moment when customer tries to connect behind proxy.
Using this test: http://websocketstest.com/ I've found that the customer's computer is behind an HTTP/1.1 proxy.
He cannot connect to websockets: onopen simply does not fire at all, and wsocket.readyState never becomes 1.
How can I tune my ServerEndpoint to make this code work even when customer is connecting behind proxy?
Thank you in advance!
UPDATE: I would provide a screenshot with websocketstest at that computer:
On my computer it seems similarly except one thing:
HTTP Proxy: NO.
Much as the comments to the questions state, it seems the Proxy doesn't support Websockets properly.
This is a common issue (some cell-phone companies have proxies that disrupt websocket connections) and the solution is to use TLS/SSL connections.
The issue comes up mainly because some proxies "correct" (read: corrupt) the Websocket request headers.
However, when using TLS/SSL, the proxies can't read the header data (which is encrypted), causing data "pass-through" on most proxies.
This means the headers will arrive safely at the other end and the proxy will (mostly) ignore the connection... this might still cause an issue where connection timeouts are concerned, but it usually resolves the issue.
EDIT
Notice that the browsers will protect the client from mixing non-encrypted content with encrypted content. Make sure the script initiates the ws connections using the wss variant when TLS/SSL connections are used.
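One way to follow that advice on the client is to derive the scheme from the page's own protocol. The sketch below assumes the same URL layout as the question; buildSocketUrl is a hypothetical helper, and it takes the location object as a parameter so it can be tried outside a browser.

```javascript
// Picks wss: when the page itself was loaded over https:, otherwise ws:.
function buildSocketUrl(loc, token) {
  const scheme = loc.protocol === 'https:' ? 'wss:' : 'ws:';
  return `${scheme}//${loc.host}${loc.pathname}dialog/${encodeURIComponent(token)}`;
}

// In the browser:
//   var wsocket = new WebSocket(buildSocketUrl(window.location, token_var));
console.log(buildSocketUrl(
  { protocol: 'https:', host: 'example.com', pathname: '/app/' },
  'abc'
));
```

The server still has to be reachable over TLS for the wss URL to work; the helper only keeps the page and the socket on matching schemes.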

Node JS live text update with CloudMQTT

I have a node server which is connecting to CloudMQTT and receiving messages in app.js. I have my client web app running on the same node server and want to display my messages received in app.js elsewhere in a .ejs file, I'm struggling as to how best to do this.
app.js
// Create a MQTT Client
var mqtt = require('mqtt');
// Create a client connection to CloudMQTT for live data
var client = mqtt.connect('xxxxxxxxxxx', {
username: 'xxxxx',
password: 'xxxxxxx'
});
client.on('connect', function() { // When connected
console.log("Connected to CloudMQTT");
// Subscribe to the temperature
client.subscribe('Motion', function() {
// When a message arrives, do something with it
client.on('message', function(topic, message, packet) {
// ** Need to pass message out **
});
});
});
Basically you need a way for the client (browser code with EJS - HTML, CSS and JS) to receive live updates. There are basically two ways to do this from the client to the node service:
A websocket session instantiated by the client.
A polling approach.
What's the difference?
Under the hood, a websocket is full-duplex communication mechanism. That means that you can open a socket from the client (browser) to the node server and they can talk to each other both ways over a long-lived session. The pro is that updates are often times instantaneous without having to incur the cost of making another HTTP request as in the polling case. The con is that it uses a socket connection that may be long-lived, and there is typically a socket pool on any server that has limited ability to deal with many sockets. There are ways to scale around this issue, but if it's a big concern for you, you may want to go with polling.
Polling is where you set up an endpoint on your server that the client JS code hits every now and then. That endpoint will return you the updated information. The con is that you are now making a new request in order to get updates, which may not be desirable if a lot of updates are expected to come through and the app is expected to be updated in the timeliest manner possible (most of the time polling is sufficient though). The pro is that you do not have a live connection open on the server indefinitely.
Again, there are many more pros and cons, these are just the obvious ones. You decide how to implement it. When the client receives the data from either of these mechanisms, you may update the UI in any suitable manner.
From the server end, you will need a way to persist the information coming from CloudMQTT. There are multiple ways to do this. If you do not care about memory consumption and are ok with potentially throwing away old data if a client does not ask for it for a while, then it may be ok to just store this in memory in a regular javascript object {}. If you do care about persisting the data between server restarts/crashes (probably best), then you can persist to something like Redis, Mongo, any of the SQL stores if your data is relational in nature, or even a regular JSON file on disk (see fs.writeFile).
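As a rough sketch of the polling variant with in-memory storage, using only Node's built-in http module: recordMessage, createPollingServer and the /latest/<topic> route are assumptions for illustration, and the stored data is lost when the server restarts.

```javascript
const http = require('node:http');

// Latest payload per topic, held in memory only.
const latestByTopic = new Map();

// Call this from the MQTT 'message' handler in app.js:
//   client.on('message', (topic, message) => recordMessage(topic, message));
function recordMessage(topic, message) {
  latestByTopic.set(topic, {
    payload: message.toString(),
    receivedAt: Date.now(),
  });
}

// Polling endpoint: GET /latest/<topic> returns the last stored message as JSON.
function createPollingServer() {
  return http.createServer((req, res) => {
    const match = /^\/latest\/([^/?]+)/.exec(req.url);
    const entry = match ? latestByTopic.get(match[1]) : undefined;
    if (!entry) {
      res.writeHead(404, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'no data' }));
      return;
    }
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(entry));
  });
}
```

The browser code rendered from the .ejs page would then request /latest/Motion on a timer and update the page with the returned payload.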
Hope this helped give you a step in the right direction!

What's the most efficient node.js inter-process communication library/method?

We have few node.js processes that should be able to pass messages,
What's the most efficient way doing that?
How about using node_redis pub/sub
EDIT: the processes might run on different machines
If you want to send messages from one machine to another and do not care about callbacks then Redis pub/sub is the best solution. It's really easy to implement and Redis is really fast.
First you have to install Redis on one of your machines.
It's really easy to connect to Redis:
var client = require('redis').createClient(redis_port, redis_host);
But do not forget about opening Redis port in your firewall!
Then you have to subscribe each machine to some channel:
client.on('ready', function() {
return client.subscribe('your_namespace:machine_name');
});
client.on('message', function(channel, json_message) {
var message;
message = JSON.parse(json_message);
// do whatever you want with the message
});
You may skip your_namespace and use global namespace, but you will regret it sooner or later.
It's really easy to send messages, too:
var send_message = function(machine_name, message) {
return client.publish("your_namespace:" + machine_name, JSON.stringify(message));
};
If you want to send different kinds of messages, you can use pmessages instead of messages:
client.on('ready', function() {
return client.psubscribe('your_namespace:machine_name:*');
});
client.on('pmessage', function(pattern, channel, json_message) {
// pattern === 'your_namespace:machine_name:*'
// channel === 'your_namespace:machine_name:'+message_type
var message = JSON.parse(json_message);
var message_type = channel.split(':')[2];
// do whatever you want with the message and message_type
});
send_message = function(machine_name, message_type, message) {
return client.publish([
'your_namespace',
machine_name,
message_type
].join(':'), JSON.stringify(message));
};
The best practice is to name your processes (or machines) by their functionality (e.g. 'send_email'). In that case process (or machine) may be subscribed to more than one channel if it implements more than one functionality.
Actually, it's possible to build bi-directional communication using Redis. But it's trickier, since it would require adding a unique callback channel name to each message in order to receive the callback without losing context.
So, my conclusion is this: use Redis if you need "send and forget" communication, and investigate other solutions if you need full-fledged bi-directional communication.
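The "unique callback channel" idea can be sketched without a Redis server by letting an in-process EventEmitter stand in for the broker. Everything here (bus, request, serve, the channel names) is hypothetical; with Redis, the emit and on calls would be replaced by publish and subscribe.

```javascript
const { EventEmitter } = require('node:events');
const { randomUUID } = require('node:crypto');

// Stand-in for the pub/sub broker. Not Redis; only the message flow is shown.
const bus = new EventEmitter();

// Requester: creates a one-off reply channel and names it in the message.
function request(channel, payload, onReply) {
  const replyTo = `${channel}:reply:${randomUUID()}`;
  bus.once(replyTo, (json) => onReply(JSON.parse(json))); // subscribe first
  bus.emit(channel, JSON.stringify({ replyTo, payload }));
}

// Responder: answers on whatever reply channel the message names.
function serve(channel, handler) {
  bus.on(channel, (json) => {
    const { replyTo, payload } = JSON.parse(json);
    bus.emit(replyTo, JSON.stringify(handler(payload)));
  });
}

serve('your_namespace:resize_image', (job) => ({ done: true, width: job.width }));
request('your_namespace:resize_image', { width: 640 }, (reply) => {
  console.log(reply);
});
```

The unique reply channel is what keeps the context: each requester only hears the answer to its own message, even when many requests are in flight.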
Why not use ZeroMQ/0mq for IPC? Redis (a database) is over-kill for doing something as simple as IPC.
Quoting the guide:
ØMQ (ZeroMQ, 0MQ, zmq) looks like an embeddable networking library
but acts like a concurrency framework. It gives you sockets that carry
atomic messages across various transports like in-process,
inter-process, TCP, and multicast. You can connect sockets N-to-N with
patterns like fanout, pub-sub, task distribution, and request-reply.
It's fast enough to be the fabric for clustered products. Its
asynchronous I/O model gives you scalable multicore applications,
built as asynchronous message-processing tasks.
The advantage of using 0MQ (or even vanilla sockets via net library in Node core, minus all the features provided by a 0MQ socket) is that there is no master process. Its broker-less setup is best fit for the scenario you describe. If you are just pushing out messages to various nodes from one central process you can use PUB/SUB socket in 0mq (also supports IP multicast via PGM/EPGM). Apart from that, 0mq also provides for various different socket types (PUSH/PULL/XREP/XREQ/ROUTER/DEALER) with which you can create custom devices.
Start with this excellent guide:
http://zguide.zeromq.org/page:all
For 0MQ 2.x:
http://github.com/JustinTulloss/zeromq.node
For 0MQ 3.x (A fork of the above module. This supports PUBLISHER side filtering for PUBSUB):
http://github.com/shripadk/zeromq.node
More than 4 years after the question was asked, there is an interprocess communication module called node-ipc. It supports unix/windows sockets for communication on the same machine as well as TCP, TLS and UDP, claiming that at least sockets, TCP and UDP are stable.
Here is a small example taken from the documentation from the github repository:
Server for Unix Sockets, Windows Sockets & TCP Sockets
var ipc=require('node-ipc');
ipc.config.id = 'world';
ipc.config.retry= 1500;
ipc.serve(
function(){
ipc.server.on(
'message',
function(data,socket){
ipc.log('got a message : '.debug, data);
ipc.server.emit(
socket,
'message',
data+' world!'
);
}
);
}
);
ipc.server.start();
Client for Unix Sockets & TCP Sockets
var ipc=require('node-ipc');
ipc.config.id = 'hello';
ipc.config.retry= 1500;
ipc.connectTo(
'world',
function(){
ipc.of.world.on(
'connect',
function(){
ipc.log('## connected to world ##'.rainbow, ipc.config.delay);
ipc.of.world.emit(
'message',
'hello'
)
}
);
ipc.of.world.on(
'disconnect',
function(){
ipc.log('disconnected from world'.notice);
}
);
ipc.of.world.on(
'message',
function(data){
ipc.log('got a message from world : '.debug, data);
}
);
}
);
I'm currently evaluating this module as a replacement for an old local IPC solution based on stdin/stdout (it could become remote IPC in the future). Maybe I will expand my answer when I'm done, to give more information on how, and how well, this module works.
I would start with the built-in functionality that Node provides.
You can use process signalling like:
process.on('SIGINT', function () {
console.log('Got SIGINT. Press Control-D to exit.');
});
The Node.js documentation describes this signal event as:
Emitted when the process receives a signal. See sigaction(2) for a
list of standard POSIX signal names such as SIGINT, SIGUSR1, etc.
Once you know about process, you can spawn a child process and hook into its 'message' event to retrieve and send messages. When using child_process.fork() you can write to the child using child.send(message, [sendHandle]), and messages are received by a 'message' event on the child.
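A minimal sketch of that parent/child messaging follows. To keep it in one file it does not call fork() on a separate script; instead it uses spawn() with an 'ipc' slot in stdio, which sets up the same kind of channel that fork() configures for you. The askChild name and the inline child source are assumptions for the example.

```javascript
const { spawn } = require('node:child_process');

// Child code, passed with `node -e` so no second file is needed.
const childSource = `
  process.on('message', (msg) => {
    process.send({ echoed: msg, pid: process.pid });
  });
`;

// Sends one message to a fresh child and resolves with its reply.
function askChild(message) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childSource], {
      // The fourth slot creates the IPC channel behind send() and 'message'.
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('error', reject);
    child.once('message', (reply) => {
      child.disconnect(); // closing the channel lets the child exit
      resolve(reply);
    });
    child.send(message);
  });
}

askChild('hello').then((reply) => console.log(reply));
```

With a real child script on disk, fork('./child.js') would replace the spawn call and the rest would stay the same.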
Also - you can use cluster. The cluster module allows you to easily create a network of processes that all share server ports.
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
// Fork workers.
for (var i = 0; i < numCPUs; i++) {
cluster.fork();
}
cluster.on('exit', function(worker, code, signal) {
console.log('worker ' + worker.process.pid + ' died');
});
} else {
// Workers can share any TCP connection
// In this case it's an HTTP server
http.createServer(function(req, res) {
res.writeHead(200);
res.end("hello world\n");
}).listen(8000);
}
For 3rd party services you can check:
hook.io, signals and bean.
take a look at node-messenger
https://github.com/weixiyen/messenger.js
will fit most needs easily (pub/sub ... fire and forget .. send/request) with automatic maintained connectionpool
We are working on a multi-process Node app that has to handle a large number of real-time cross-process messages.
We tried Redis pub/sub first, which failed to meet the requirements.
Then we tried TCP sockets, which were better, but still not the best.
So we switched to UDP datagrams, which are much faster.
Here is the code repo, just a few lines of code.
https://github.com/SGF-Games/node-udpcomm
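For reference, a bare-bones datagram exchange with Node's built-in dgram module might look like the sketch below. It is not the code from the repository above; sendOnce is a hypothetical helper, both sockets live in one process only to keep the example self-contained, and UDP gives no delivery or ordering guarantees, so a real setup needs to tolerate lost messages.

```javascript
const dgram = require('node:dgram');

// Binds a receiver on a free loopback port, sends it one JSON datagram from
// a second socket, and resolves with the decoded message.
function sendOnce(payload) {
  return new Promise((resolve, reject) => {
    const receiver = dgram.createSocket('udp4');
    const sender = dgram.createSocket('udp4');

    receiver.on('error', (err) => {
      receiver.close();
      reject(err);
    });
    receiver.on('message', (msg, rinfo) => {
      receiver.close();
      resolve({ from: rinfo.address, message: JSON.parse(msg.toString()) });
    });

    receiver.bind(0, '127.0.0.1', () => {
      const { port } = receiver.address();
      sender.send(JSON.stringify(payload), port, '127.0.0.1', (err) => {
        sender.close();
        if (err) reject(err);
      });
    });
  });
}

sendOnce({ type: 'tick', value: 1 }).then((received) => console.log(received));
```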
I needed IPC between web server processes in another language (Perl;) a couple years ago. After investigating IPC via shared memory, and via Unix signals (e.g. SIGINT and signal handlers), and other options, I finally settled on something quite simple which works quite well and is fast. It may not fit the bill if your processes do not all have access to the same file system, however.
The concept is to use the file system as the communication channel. In my world, I have an EVENTS dir, and under it sub dirs to direct the message to the appropriate process: e.g. /EVENTS/1234/player1 and /EVENTS/1234/player2 where 1234 is a particular game with two different players. If a process wants to be aware of all events happening in the game for a particular player, it can listen to /EVENTS/1234/player1 using (in Node.js):
fs.watch
(or fsPromises.watch)
If a process wanted to listen to all events for a particular game, simply watch /EVENTS/1234 with the 'recursive: true' option set for fs.watch. Or watch /EVENTS to see all msgs -- the event produced by fs.watch will tell you which file path was modified.
For a more concrete example, in my world I have the web browser client of player1 listening for Server-Sent Events (SSE), and there is a loop running in one particular web server process to send those events. Now, a web server process servicing player2 wants to send a message (IPC) to the server process running the SSEs for player1, but doesn't know which process that might be; it simply writes (or modifies) a file in /EVENTS/1234/player1. That directory is being watched -- via fs.watch -- in the web server process handling SSEs for player1. I find this system very flexible, and fast, and it can also be designed to leave a record of all messages sent. I use it so that one random web server process of many can communicate to one other particular web server process, but it could also be used in an N-to-1 or 1-to-N manner.
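A small sketch of the idea in Node.js, assuming JSON files as the message format: watchMailbox and sendToMailbox are hypothetical names, a temporary directory stands in for /EVENTS/1234/player1, and fs.watch may report the same write more than once or behave differently per platform, so the handler has to tolerate duplicates.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Watches one "mailbox" directory and calls onMessage with the parsed JSON
// of each file that changes in it. Returns the FSWatcher so it can be closed.
function watchMailbox(dir, onMessage) {
  return fs.watch(dir, (eventType, filename) => {
    if (!filename) return; // the filename is not provided on every platform
    fs.readFile(path.join(dir, filename), 'utf8', (err, text) => {
      if (err) return; // the file may already be gone
      try {
        onMessage(filename, JSON.parse(text));
      } catch {
        // Ignore a file that is still being written.
      }
    });
  });
}

// Sender side: writing (or modifying) a file is the whole "send".
function sendToMailbox(dir, name, message) {
  fs.writeFileSync(path.join(dir, name), JSON.stringify(message));
}

const mailbox = fs.mkdtempSync(path.join(os.tmpdir(), 'events-player1-'));
let handled = false;
const watcher = watchMailbox(mailbox, (filename, message) => {
  if (handled) return; // one write can produce several events
  handled = true;
  console.log(filename, message);
  watcher.close();
});
sendToMailbox(mailbox, 'move.json', { player: 'player2', move: 'e4' });
```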
Hope this helps someone. You're basically letting the OS and the file system do the work for you. Here are a couple links on how this works in MacOS and Linux:
https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/Introduction/Introduction.html#//apple_ref/doc/uid/TP40005289
https://man7.org/linux/man-pages/man7/inotify.7.html
Any module you're using in whatever language is hooking into an API like one of these. It's been 30+ years since I've fiddled much with Windows, so I don't know how file system events work there, but I bet there's an equivalent.
EDIT (more info on different platforms from https://nodejs.org/dist/latest-v19.x/docs/api/fs.html#fswatchfilename-options-listener):
Caveats
The fs.watch API is not 100% consistent across platforms, and is unavailable in some situations.
On Windows, no events will be emitted if the watched directory is moved or renamed. An EPERM error is reported when the watched directory is deleted.
Availability
This feature depends on the underlying operating system providing a way to be notified of file system changes.
On Linux systems, this uses inotify(7).
On BSD systems, this uses kqueue(2).
On macOS, this uses kqueue(2) for files and FSEvents for directories.
On SunOS systems (including Solaris and SmartOS), this uses event ports.
On Windows systems, this feature depends on ReadDirectoryChangesW.
On AIX systems, this feature depends on AHAFS, which must be enabled.
On IBM i systems, this feature is not supported.
If the underlying functionality is not available for some reason, then fs.watch() will not be able to function and may throw an exception. For example, watching files or directories can be unreliable, and in some cases impossible, on network file systems (NFS, SMB, etc) or host file systems when using virtualization software such as Vagrant or Docker.
It is still possible to use fs.watchFile(), which uses stat polling, but this method is slower and less reliable.
EDIT2: https://www.npmjs.com/package/node-watch is a wrapper that may help on some platforms
Not everybody knows that pm2 has an API thanks to which you can communicate to its processes.
// pm2-call.js:
import pm2 from "pm2";
pm2.connect(() => {
pm2.sendDataToProcessId(
{
type: "process:msg",
data: {
some: "data",
hello: true,
},
id: 0,
topic: "some topic",
},
(err, res) => {}
);
});
pm2.launchBus((err, bus) => {
bus.on("process:msg", (packet) => {
// packet.data is the payload sent back by the app, here { success: true }
console.log(packet.process.pm_id, packet.data);
});
});
// pm2-app.js:
process.on("message", (packet) => {
process.send({
type: "process:msg",
data: {
success: true,
},
});
});
