Can Socket.io emits arrive out of order? What if volatile?

Can Socket.io emits arrive out of order? What if volatile? - javascript

I've been looking around for a definitive answer to this but I seem to keep finding contradictory answers (ex this and this).
Basically, if I
socket.emit('game_update', {n: 1});
from a node.js server and then, 20 ms later,
socket.emit('game_update', {n: 2});
from the same server, is there any way that the n:2 message arrives before the n:1 message? In other words, does the n:1 message "block" the receiving of the n:2 message if the n:1 message somehow got lost on the way?
What if they were volatile emits? My understanding is that the n:1 message wouldn't block the n:2 message -- if the n:1 message got dropped, the n:2 message would still be received whenever it arrived.
Background: I'm building a node.js game server and want to better understand how my game updates are traveling. I'm using volatile emit right now and I would like to increase the server's tick rate, but I want to make sure that independent game updates wouldn't block each other. I would rather the client receive an update every 30 ms with a few dropped updates scattered here and there than have the client receive an update, receive nothing for 200 ms, and then receive 6 more updates all at once.

Disclaimer: I'm not completely familiar with the internals of socket.io.
is there any way that the n:2 message arrives before the n:1 message?
It depends on the transport that you're using. For the polling transport, I think it's fair to say that it's perfectly possible for messages to arrive out-of-order, because each message can arrive over a different connection.
With the websocket transport, which maintains a persistent connection, the message order is reasonably guaranteed.
What if they were volatile emits?
With volatile emits, all bets are off, it's fire-and-forget. I think that in normal situations, the server will wait (and queue up messages) for a client to be ready to receive messages, unless those messages are volatile, in which case the server will just drop them.
From what you're saying, I think that volatile emits are what you want, although once a websocket connection has been established I don't think you'll see the described scenario ("receive an update, receive nothing for 200 ms, and then receive 6 more updates all at once") is likely to happen. Perhaps only when the connection gets lost and is re-established.

The answer is yes it can possibly arrive later, but it is highly unlikely given that sockets are by nature persistent connections and reliability of order is all but guaranteed.
According to the Socket.io documentation messages will be discarded in the case that the client is not connected. This doesn't necessarily fit with your use case, however within the documentation itself it describes Volatile events as an interesting example if you need to send the position of a character.
// server-side
io.on("connection", (socket) => {
console.log("connect");
socket.on("ping", (count) => {
console.log(count);
});
});
// client-side
let count = 0;
setInterval(() => {
socket.volatile.emit("ping", ++count);
}, 1000);
If you restart the server, you will see in the console:
connect
1
2
3
4
# the server is restarted, the client automatically reconnects
connect
9
10
11
Without the volatile flag, you would see:
connect
1
2
3
4
# the server is restarted, the client automatically reconnects and sends its
buffered events
connect
5
6
7
8
9
10
11
Note: The documentation explicitly states that this will happen during a server restart, meaning that your connection to the client likely has to be lost in order for the volatile emits to be dropped.
I would say a good practice would be to write your emits as volatile just in case you do get a dropped client, however this will depend heavily on your game requirements.
As for the goal, I would recommend that you use client side prediction using some sort of dynamic time system or deltatime based on the client and server keeping a sync clock to help alleviate some of the problems you can incur. Here's an example of how you can do that, though I'm not a fan of the creators syntax, it can be easily adapted to your needs.
Hope this helps anyone who hits this topic.
Socket.io - Volatile events
Client Side Prediction

Related

Messages being delayed when using websockets

I have a program which is using the Websocket TCP: The client is an extension in Chrome and the server is an application written in C++.
When I send small data from the client to the server, it works fine. But when I send large amounts of data (e.g. a source html page), it will be slightly delayed.
For Example:
Client sends: 1,2,3
Server receives: 1,2
Client sends: 4
Server receives: 3
Client sends: 5
Server receives: 4
It's seems like it's a delay.
This is my code client:
var m_cWebsocket = new WebSocket("Servername");
if (m_cWebsocket == null) { return false; }
m_cWebsocket.onopen = onWebsocketOpen(m_cWebsocket); m_cWebsocket.onmessage = onWebsocketMessage;
m_cWebsocket.onerror = onWebsocketError;
m_cWebsocket.onclose = onWebsocketError;
I using m_cWebsocket.send(strMsg) to send data.
Server code
while (true) { recv(sSocket, szBufferTmp, 99990, 0); //recv(sSocket,
szBufferTmp, 99990, MSG_PEEK); //some process }

Since you haven't posted any code to show your implementation of the TCP server or client I can only speculate and try to explain what might be going on here.
That means the potential problems and solutions I outline below may or may not apply to you, but regardless this information should still be helpful to others who might find this question in the future.
TL;DR: (most likely) It's either the server is too slow, the server is not properly waiting for complete 'tcp packets' to be buffered, or the server doesn't know when to properly start and stop and is de-synching while it waits for what it thinks is a 'full packet' as defined by something like a buffer size.
It sounds to me like you are pushing data from the client either faster than the server the server can read, or more likely, the server is buffering a set number of bytes from the current TCP Stream and waiting for the buffer to fill before outputting additional data.
If you are sending this over localhost it's unlikely you are not close to limit of the stream though, and I would expect a server written in C++ would be able to keep up with the javascript client.
So this leads me to believe that the issue is in fact the stream buffer on the C++ side.
Now since the server has no way to know to what data you are sending and or how much of it you are sending, it is common for a TCP stream to utilize a stream buffer that contiguously reads data from the socket until either the buffer has filled to a known size, or until it sees a predefined 'stop character'. This would usually be something like a "line end" or \n character, sometimes \n\r (line feed, carriage feed) depending on your operating system.
Since you haven't specified how you are receiving your data, I'm going to just assume you created either a char or byte buffer of a certain size. I'm a pretty rusty on my C++ socket information so I might be wrong, but I do believe there is a default 'read timeout' on C++ tcp streams as well.
This means you are possibly running into 1 of 2 issues.
Situation 1) You are waiting until that byte/char buffer is filled before outputing it's data. Issue is that will act like a bus that only leaves the station when all seats are filled. If you don't fill all the seats, you server is just sitting and waiting until it gets more data to fill up fully and output your data.
Situation 2) You are running up against the socket read timeout and therefore the function is not getting all the data before outputting the data. This is like a bus that is running by the clock. Every 10 minutes that bus leaves the station, doesn't matter if that bus is full or empty, it's leaving and the next bus will pick up anyone who shows up late. In your case, the TCP stream isn't able to load 1, 2 and 3 onto a bus fast enough, so the bus leaves with just 1, 2 on it because after 20ms of not receiving data, the server is exiting from the function and outputing the data. On the next loop however, there is 3 waiting at the top of the stream buffer ready to get on the next bus out. The Stream will load 3, wait til those 20ms are finished, and then exit before repeating this loop.
I think it's more likely the first situation is occurring though, as I would expect the server to either start catching up, or falling further behind as the 2 servers either begin to sync together, or have internall TPC stream buffer fill up as the server falls further and further behind.
Main point here, you need some way to synchronize the client and the server connections. I would recommend sending a "start byte" and "End byte" to single when a message has begun and finished, so you don't exit the function too early.
Or send a start byte, followed by the packet size in bytes, then filling up the buffer until your buffer has the correct numbers of bytes. Additionally you could include an end byte as well for some basic error checking.
This is a pretty involved topic and hard to really give you a good answer without any code from you, but this should also help anyone in the future who might be having a similar issue.
EDIT I went back and re-read your question and noticed you said it was only with large amounts of data, so I think my original assumption was wrong, and it's more likely situation 2 because the client is sending the data to your server faster than the server can read it, and thus might be bottle necking the connection and the client is only able to send additional data once the server has emptied part of it's TCP stream buffer.
Think of it like a tube of of water. The socket (tube) can only accept (fill up) with so much data (water) before it's full. Once you let some water out the bottom though, you can fill it up a little bit more. The only reason it works for small files is that the file is too small to fill the entire tube.
Additional thoughts: You can see how I was approaching this problem in C# in this question: Continuously reading from serial port asynchronously properly
And another similar question I had previously (again in C#): How to use Task.WhenAny with ReadLineAsync to get data from any TcpClient
It's been awhile since I've played with TCP streams though, so my apologies in that I don't remember all the niche details and caveats of the protocal, but hopefully this information is enough to get you in the ball park for solving your problem.
Full disclaimer, it's been over 2 years since I last touched C++ TCP sockets, and have since worked with sockets/websockets in other languages (such as C# and JavaScript), so I may have some facts wrong about the behavior of C++ TCP sockets specifically, but the core information should still apply. If I got anything wrong, someone in the comments will most likely have the correct information.
Lastly, welcome to stack overflow!

Google Pub/Sub How to set read timeout on Pull

I would like to set the read timeout of the pull request on a subscription. Right now the only options are to set returnImmediately=true or just wait until the pubsub returns, which seems to be 90 seconds if no messages is published.
I'm using the gcloud-node module to make calls to pubsub. It uses the request module under the hood to make the the gcloud api calls. I've updated my local copy of gcloud-node/lib/pubsub/subscription.js to set the request timeout to 30 seconds
this.request({
method: 'POST',
uri: ':pull',
timeout: 30000,
json: {
returnImmediately: !!options.returnImmediately,
maxMessages: options.maxResults
}
}
When I do this, the behavior I see is the connection will timeout on the client side after 30 seconds, but pubsub still has the request open. If I have two clients pulling on the subscription and one of them timeout after 30 seconds, then a message is published to the topic, it is a 50/50 chance that the remaining listening client will retrieve the message.
Is there a way to tell pubsub to timeout pull connections after a certain amount of time?
UPDATE:
I probably need to clarify my example a bit. I have two clients that connect at the same time and pull from the same subscription. The only difference between the two is that the first one is configured to timeout after 30 seconds. Since two clients are connected to the same subscription, pubsub will distribute the message load between the two of them. If I publish a message 45 seconds after both clients connect, there is a 50/50 chance that pubsub will deliver the message to the second client that has not timed out yet. If I send 10 messages instead of just one, the second client will receive a subset of the 10 messages. It looks like this is because my clients are in a long poll. If the client disconnects, the server has no idea and will try to send published messages on the response of the request that was made by the client that has timed out. From my tests, this is the behavior I've observed. What I would like to do is be able to send a timeout param in the pull request to tell subpub to send back a response after a 30000ms if no messages are published during that time. Reading over the API docs, this doesn't seem like an option.

Setting the request timeout as you have is the correct way to timeout the pull after 30 seconds. The existence of the canceled request might not be what is causing the other pull to not get the message immediately. If your second pull (that does not time out) manages to pull other messages that were published earlier, it won't necessarily wait for additional message that was published after the timeout to come in before completing. It only guarantees to not return more than maxMessages, not to return only once it has exactly maxMessages (if that many are available). Once your publish completes, some later pull will get the message, but there are no guarantees on exactly when that will occur.

Node JS live text update with CloudMQTT

I have a node server which is connecting to CloudMQTT and receiving messages in app.js. I have my client web app running on the same node server and want to display my messages received in app.js elsewhere in a .ejs file, I'm struggling as to how best to do this.
app.js
// Create a MQTT Client
var mqtt = require('mqtt');
// Create a client connection to CloudMQTT for live data
var client = mqtt.connect('xxxxxxxxxxx', {
username: 'xxxxx',
password: 'xxxxxxx'
});
client.on('connect', function() { // When connected
console.log("Connected to CloudMQTT");
// Subscribe to the temperature
client.subscribe('Motion', function() {
// When a message arrives, do something with it
client.on('message', function(topic, message, packet) {
// ** Need to pass message out **
});
});
});

Basically you need a way for the client (browser code with EJS - HTML, CSS and JS) to receive live updates. There are basically two ways to do this from the client to the node service:
A websocket session instantiated by the client.
A polling approach.
What's the difference?
Under the hood, a websocket is full-duplex communication mechanism. That means that you can open a socket from the client (browser) to the node server and they can talk to each other both ways over a long-lived session. The pro is that updates are often times instantaneous without having to incur the cost of making another HTTP request as in the polling case. The con is that it uses a socket connection that may be long-lived, and there is typically a socket pool on any server that has limited ability to deal with many sockets. There are ways to scale around this issue, but if it's a big concern for you, you may want to go with polling.
Polling is where you set up an endpoint on your server that the client JS code hits every now and then. That endpoint will return you the updated information. The con is that you are now making a new request in order to get updates, which may not be desirable if a lot of updates are expected to come through and the app is expected to be updated in the timeliest manner possible (most of the time polling is sufficient though). The pro is that you do not have a live connection open on the server indefinitely.
Again, there are many more pros and cons, these are just the obvious ones. You decide how to implement it. When the client receives the data from either of these mechanisms, you may update the UI in any suitable manner.
From the server end, you will need a way to persist the information coming from CloudMQTT. There are multiple ways to do this. If you do not care about memory consumption and are ok with potentially throwing away old data if a client does not ask for it for a while, then it may be ok to just store this in memory in a regular javascript object {}. If you do care about persisting the data between server restarts/crashes (probably best), then you can persist to something like Redis, Mongo, any of the SQL stores if your data is relational in nature, or even a regular JSON file on disk (see fs.writeFile).
Hope this helped give you a step in the right direction!

Is it expensive/efficient to send data between processes in Node?

Node allows you to spawn child processes and send data between them. You could use it do execute some blocking code for example.
Documentation says "These child Nodes are still whole new instances of V8. Assume at least 30ms startup and 10mb memory for each new Node. That is, you cannot create many thousands of them."
I was wondering if is it efficient, should I worry about some limitations? Here's example code:
//index.js
var childProcess1 = childProcess.fork('./child1.js');
childProcess1.send(largeArray);
childProcess1.once('message', function(formattedData) {
console.log(formattedData);
return false;
});
//child1.js
process.on('message', function(data) {
data = format(data); //do smth with data, then send it back to index.js
try{
process.send(data);
return false;
}
catch(err){
console.log(err);
return false;
}
});

The documentation is telling you that starting new node processes is (relatively) expensive. It is unwise to fork() every time you need to do work.
Instead, you should maintain a pool of long-running worker processes – much like a thread pool. Queue work requests in your main process and dispatch them to the next available worker when it goes idle.
This leaves us with a question about the performance profile of node's IPC mechanism. When you fork(), node automatically sets up a special file descriptor on the child process. It uses this to communicate between processes by reading and writing line-delimited JSON. Basically, when you process.send({ ... }), node JSON.stringifys it and writes the serialized string to the fd. The receiving process reads this data until hitting a line break, then JSON.parses it.
This necessarily means that performance will be highly dependent on the size of the data you send between processes.
I've roughed out some tests to get a better idea of what this performance looks like.
First, I sent a message of N bytes to the worker, which immediately responded with a message of the same length. I tried this with 1 to 8 concurrent workers on my quad-core hyper-threaded i7.
We can see that having at least 2 workers is beneficial for raw throughput, but more than 2 essentially doesn't matter.
Next, I sent an empty message to the worker, which immediately responded with a message of N bytes.
Surprisingly, this made no difference.
Finally, I tried sending a message of N bytes to the worker, which immediately responded with an empty message.
Interesting — performance does not degrade as rapidly with larger messages.
Takeaways
Receiving large messages is slightly more expensive than sending them. For best throughput, your master process should not send messages larger than 1 kB and should not receive messages back larger than 128 bytes.
For small messages, the IPC overhead is about 0.02ms. This is small enough to be inconsequential in the real world.
It is important to realize that the serialization of the message is a synchronous, blocking call; if the overhead is too large, your entire node process will be frozen while the message is sent. This means I/O will be starved and you will be unable to process any other events (like incoming HTTP requests). So what is the maximum amount of data that can be sent over node IPC?
Things get really nasty over 32 kB. (These are per-message; double to get roundtrip overhead.)
The moral of the story is that you should:
If the input is larger than 32 kB, find a way to have your worker fetch the actual dataset. If you're pulling the data from a database or some other network location, do the request in the worker. Don't have the master fetch the data and then try to send it in a message. The message should contain only enough information for the worker to do its job. Think of messages like function parameters.
If the output is larger than 32 kB, find a way to have the worker deliver the result outside of a message. Write to disk or send the socket to the worker so that you can respond directly from the worker process.

This really depends on your server resources and the number of nodes you need to spin up.
As a rule of thumb:
Try reusing running children as much as possible - this will save you 30ms start up time
Do not start unlimited number of children (1 per request for instance) - you will not run out of RAM
The messaging itself it relatively fast i believe. Would be great to see some metrics though.
Also, note that if you have single CPU or running a cluster (using all available cores) it doesn't make much sense. You still have limited CPU capacity and switching context is more expensive than running single process.

Rate-limiting the data sent down WebSockets

We're sending a lot of data down a websocket (from a Node.js app to the web browser).
The data is binary data in the form of blobs.
Occasionally, the end-user is on a poor connection - and in this case, we'd like to 'skip' messages (leave them out) and make sure we don't cram down more data than the user can receive.
On the server side, we have tried:
function sendBlob(blob, socket) {
console.log('socket.bufferedAmount: ' + socket.bufferedAmount); // Always zero
if (socket.bufferedAmount > 0) {
return; // Never called
}
socket.send(blob);
}
Unfortunately bufferedAmount always returns zero.
Is this the right way to see how much data is being queued but not sent/received in websockets, or is there a better way to achieve this?
(Have also tried logging socket.bufferedAmount on the client-side, but it also always returns zero).

The socket.bufferedAmount property that exists on clients (as well as the ws module for Node) is the amount of bytes that it itself has buffered, not the remote. That means socket.bufferedAmount on the server means how many bytes that are waiting to be sent to the client, and for the client it is how many bytes are waiting to be sent to the server.
The reason you aren't getting any change in the property is that your network is probably indeed sufficient to deliver the data. If you actually want to see a difference in socket.bufferedAmount, then try throttling your browser network access. This can be done with browser extensions or tools like NetLimiter.
If you want to throttle connections by skipping messages, you can think about creating some type of heartbeat system between the client and server. There are many ways you could do this, such as applying this function:
setInterval(function() {
if (socket.bufferedAmount == 0) {
socket.send('heartbeat');
}
}, 1000);
And then detecting missed heartbeats by counting the time interval. This is rather inefficient, but there's also other ways to do this such as responding to sent data from the server (although take into consideration that if you want to send a heartbeat when receiving data, the heartbeat itself might get throttled or other side effects).
An alternative solution would also be available if you were willing to switch to Socket.IO. It has a feature that allows you to send volatile messages, which are messages that are dropped if the client is busy or is not able to accept messages for any reason.
var io = require('socket.io').listen(80);
io.sockets.on('connection', function (socket) {
var timer = setInterval(function () {
socket.volatile.emit('data', 'payload');
}, 100);
socket.on('disconnect', function () {
clearInterval(timer);
});
});
Do note that Socket.IO will be heavier on your application, and doesn't use the native websocket protocol. It will utilize websockets when it is an option, but it is one of many transports. Socket.IO is built on Engine.IO, which uses a fork of the module you're currently using.

The readonly attribute bufferedAmount represents the number of bytes of UTF-8 text that have been queued using send() method.
And your case here shows that you are trying to access it on message received from server.
so the bufferedAmount is not set.

We Keep Coding

JavaScript is the programming language of the Web.