How to connect socket.io clients using a loop - javascript

I have a number of socket.io servers listening on consecutive ports (4001, 4002, 4003, etc.).
I want to connect a socket client to each corresponding server using a loop:
connectSockets = (sensors) => {
  const responses = {};
  for (const [idx, sensor] of sensors.entries()) {
    const socket = socketIOClient(`${endpointBase}:${port + idx}`);
    socket.on(`From::${sensor}`, data => {
      responses[sensor] = data;
    });
  }
  this.setState({
    responses
  });
};
When I break inside the loop on:
responses[sensor] = data
I can see the data. It is even getting assigned to the appropriate "responses" property.
However, when I get out of the loop and break inside setState I see:
responses = {}
No idea why. A scoping issue of some kind? Or maybe I am confused as to how socket.io works - this is my first time using it. Interestingly, when I break on "const socket" I get three iterations as I expect, but when I break on "responses[sensor] = data" I get a lot more than 3 iterations. Anyone have any ideas?

Well, I figured it out on my own. As it turns out, it was a misunderstanding of how socket.io works. As t.niese mentions, the callback is async.
What I needed to do was set the state within the socket callback, since "data" will not necessarily be available when the loop completes:
socket.on(`From::${sensor}`, data => {
  responses[sensor] = data;
  this.setState({ responses });
});
Worked perfectly.
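For reference, here is the complete handler with that change applied (a sketch; endpointBase and port are assumed to be defined as in the question):

connectSockets = (sensors) => {
  const responses = {};
  for (const [idx, sensor] of sensors.entries()) {
    const socket = socketIOClient(`${endpointBase}:${port + idx}`);
    socket.on(`From::${sensor}`, data => {
      // Update state from inside the callback, once data has actually arrived
      responses[sensor] = data;
      this.setState({ responses });
    });
  }
};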

It is not, as you assumed in your question (or your answer), a scoping issue, but related to when certain parts of the code are executed. If you place some console.log calls (or breakpoints) in your code, it becomes obvious:
connectSockets = (sensors) => {
  const responses = {};
  console.log('before loop');
  for (const [idx, sensor] of sensors.entries()) {
    const socket = socketIOClient(`${endpointBase}:${port + idx}`);
    socket.on(`From::${sensor}`, data => {
      console.log(`socket.on callback for: ${sensor}`);
      responses[sensor] = data;
    });
  }
  console.log('after loop');
  this.setState({responses});
};
The output in the console, or the order in which those breakpoints are reached, is:
before loop
after loop
socket.on callback for: ...
...
socket.on callback for: ...
The event handlers registered with .on are called at the time when an event happens, but because JavaScript is not multi-threaded, the event handling cannot happen while the code that registers the handlers with .on is still running.
You can solve that - as you already figured out - by moving the this.setState({responses}) into the event callback. Now you call this.setState multiple times. Whether this is a problem depends on your setState function.

responses[sensor] = data is triggered asynchronously, at a later point in time.
this.setState({responses}) is called in the same event loop tick as the loop itself. This is the reason you don't find it populated with anything.

Related

Are JavaScript event loop operations on variables blocking?

In the non-blocking event loop of JavaScript, is it safe to read and then alter a variable? What happens if two processes want to change a variable nearly at the same time?
Example A:
Process 1: Get variable A (it is 100)
Process 2: Get variable A (it is 100)
Process 1: Add 1 (it is 101)
Process 2: Add 1 (it is 101)
Result: Variable A is 101 instead of 102
Here is a simplified example with an Express route. Let's say the route gets called 1000 times per second:
let counter = 0;

const getCounter = () => {
  return counter;
};

const setCounter = (newValue) => {
  counter = newValue;
};

app.get('/counter', (req, res) => {
  const currentValue = getCounter();
  const newValue = currentValue + 1;
  setCounter(newValue);
});
Example B:
What if we do something more complex like Array.findIndex() and then Array.splice()? Could it be that the found index has become outdated because another event-process already altered the array?
Process A findIndex (it is 12000)
Process B findIndex (it is 34000)
Process A splice index 12000
Process B splice index 34000
Result: Process B removed the wrong index, should have removed 33999 instead
const veryLargeArray = [
  // ...
];

app.get('/remove', (req, res) => {
  const id = req.query.id;
  const i = veryLargeArray.findIndex(val => val.id === id);
  veryLargeArray.splice(i, 1);
});
Example C:
What if we add an async operation into Example B?
const veryLargeArray = [
  // ...
];

app.get('/remove', (req, res) => {
  const id = req.query.id;
  const i = veryLargeArray.findIndex(val => val.id === id);
  someAsyncFunction().then(() => {
    veryLargeArray.splice(i, 1);
  });
});
This question was kind of hard to find the right words to describe it. Please feel free to update the title.
As per #ThisIsNoZaku's link, Javascript has a 'Run To Completion' principle:
Each message is processed completely before any other message is processed.
This offers some nice properties when reasoning about your program, including the fact that whenever a function runs, it cannot be pre-empted and will run entirely before any other code runs (and can modify data the function manipulates). This differs from C, for instance, where if a function runs in a thread, it may be stopped at any point by the runtime system to run some other code in another thread.
A downside of this model is that if a message takes too long to complete, the web application is unable to process user interactions like click or scroll. The browser mitigates this with the "a script is taking too long to run" dialog. A good practice to follow is to make message processing short and if possible cut down one message into several messages.
Further reading: https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop
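A quick illustration of run-to-completion applied to Example A (my own sketch, not from the linked page):

let a = 100;

// Two callbacks queued on the event loop. Each runs to completion before
// the next one starts, so this read-modify-write is never interleaved.
setTimeout(() => { a = a + 1; }, 0);
setTimeout(() => { a = a + 1; }, 0);

setTimeout(() => console.log(a), 0); // logs 102, never 101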
So, for:
Example A: This works perfectly fine as a site counter.
Example B: This works perfectly fine as well, but if many requests happen at the same time, the last request submitted will be waiting quite some time.
Example C: If another call to /remove is sent before someAsyncFunction finishes, then it is entirely possible that your array will be invalid. The way to resolve this is to move the index lookup into the .then clause of the async function, as sketched below.
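A sketch of that fix, reusing the names from Example C (the res.sendStatus call is my addition, since the original omits a response):

app.get('/remove', (req, res) => {
  const id = req.query.id;
  someAsyncFunction().then(() => {
    // Look up the index only after the async work completes,
    // so it cannot have gone stale while we were waiting.
    const i = veryLargeArray.findIndex(val => val.id === id);
    if (i !== -1) veryLargeArray.splice(i, 1);
    res.sendStatus(200);
  });
});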
IMO, at the cost of latency, this solves a lot of potentially painful concurrency problems. If you must optimise the speed of your requests, then my advice would be to look into different architectures (additional caching, etc).

How to stop reading child_added with Firebase cloud functions?

I am trying to get all 10 records using this:
exports.checkchanges = functions.database.ref('school/{class}').onCreate(snap => {
  const className = snap.params.class; // renamed: "class" itself is a reserved word
  var ref = admin.database().ref('/students');
  return ref.orderByChild(className).startAt('-').on("child_added", function(snapshot) {
    const age = snapshot.child("age");
    // do the thing
  });
});
The problem is that after I correctly get the 10 records I need, this function is still invoked whenever a matching record is added, even days later.
When I change on("child_added") to once("child_added") I get only 1 record instead of 10. And when I change on("child_added") to on("value") I get null on this:
const age = snapshot.child("age");
So how can I prevent the function from being invoked for future changes?
When you implement database interactions in Cloud Functions, it is important to have a deterministic end condition. Otherwise the Cloud Functions environment doesn't know when your code is done, and it may either kill it too soon, or keep it running (and thus billing you) longer than is necessary.
The problem with your code is that you attach a listener with on and then never remove it. In addition (since on() doesn't return a promise), Cloud Functions doesn't know that you're done. The result is that your on() listener may live indefinitely.
That's why in most Cloud Functions that use the Realtime Database, you'll see them using once(). To get all children with a once(), we'll listen for the value event:
exports.checkchanges = functions.database.ref('school/{class}').onCreate(snap => {
  const className = snap.params.class; // renamed: "class" itself is a reserved word
  var ref = admin.database().ref('/students');
  return ref.orderByChild(className).startAt('-').limitToFirst(10).once("value", function(snapshot) {
    snapshot.forEach(function(child) {
      const age = child.child("age");
      // do the thing
    });
  });
});
I added a limitToFirst(10), since you indicated that you only need 10 children.
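Note that once('value') also returns a promise, so an equivalent sketch (same assumptions as the code above) can return the promise chain directly, giving Cloud Functions an unambiguous end condition:

exports.checkchanges = functions.database.ref('school/{class}').onCreate(snap => {
  const className = snap.params.class;
  return admin.database().ref('/students')
    .orderByChild(className).startAt('-').limitToFirst(10)
    .once('value')
    .then(snapshot => {
      snapshot.forEach(child => {
        const age = child.child('age');
        // do the thing
      });
    });
});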

Semaphore equivalent in Node.js, variable getting modified in concurrent requests?

I have been facing this issue for the past week and I am just confused about it.
Keeping it short and simple to explain the problem:
We have an in-memory model which stores values like budget etc. Now when a call is made to the API, it has a spent amount associated with it.
We then check the in-memory model, add the spent to the existing spend, and compare it to the budget; if it exceeds the budget, we do not accept any more clicks for that model. For each call we also update the db, but that is an async operation.
A short example
api.get('/clk/:spent/:id', function(req, res) {
  checkbudget(spent, id);
});

function checkbudget(spent, id) {
  var obj = inMemoryModel[id];
  obj.spent += spent;
  if (obj.spent > obj.budget) {
    obj.status = 11; // 11 is the stopped status
    // update db and rebuild model
  }
}
This used to work fine, but now with concurrent requests we are getting false spends: the spend increases beyond the budget and the campaign only stops some time later. We simulated the calls with JMeter and found this.
As far as we could tell, Node is async, so by the time the status is updated to 11 many requests have already updated the spent for the campaign.
How can we have semaphore-like logic in Node.js so that the spend stays in sync with the budget in the model?
Update:
db.addSpend(campaignId, spent, function(err, data) {
  camp.spent += spent;
  var totalSpent = (+camp.spent) + (+camp.cpb);
  if (totalSpent > camp.budget) {
    logger.info('Stopping it..');
    camp.status = 11; // in-memory stop
    var History = [];
    History.push(some data);
    db.stopCamp(campId, function(err, data) {
      if (err) {
        logger.error('Error while stopping');
      }
      model.campMAP = buildCatMap(model);
      model.campKeyMap = buildKeyMap(model);
      db.campEventHistory(cpcHistory, false, function(err) {
        if (err) {
          logger.error(err);
        }
      });
    });
  }
});
That is the gist of the code. Can anyone help, please?
Q: Is there a semaphore or equivalent in NodeJs?
A: No.
Q: Then how do NodeJs users deal with race conditions?
A: In theory you shouldn't have to, as there are no threads in JavaScript.
Before going deeper into my proposed solution I think it is important for you to know how NodeJs works.
NodeJs is driven by an event-based architecture. This means that in the Node process there is an event queue that contains all the "to-do" events.
When an event gets popped from the queue, Node executes all of the required code until it is finished. Any async calls made during that run are spawned as other events and queued up in the event queue until a response comes back and it is time to run them again.
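A minimal illustration of that queueing (my own sketch):

console.log('first');                      // runs immediately
setTimeout(() => console.log('third'), 0); // queued as a new event
console.log('second');                     // still part of the current run
// Output: first, second, third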
Q: So what can I do to ensure that only 1 request can perform updates to the database at a time?
A: I believe there are many ways you can achieve this, but one of the easier ways out is to use the setTimeout API.
Example:
api.get('/clk/:spent/:id', function(req, res) {
  var data = {
    id: id,
    spending: spent
  };
  canProceed(data, /* function to exec after canProceed = */ checkbudget);
});

var canProceed = function(data, next) {
  var model = inMemoryModel[data.id];
  if (model.is_updating) {
    // still locked: queue another check in 1000 milliseconds
    setTimeout(function() { canProceed(data, next); }, 1000);
  }
  else {
    // lock is released. Proceed.
    next(data.spending, data.id);
  }
};

function checkbudget(spent, id) {
  var obj = inMemoryModel[id];
  obj.is_updating = true; // Lock this model
  obj.spent += spent;
  if (obj.spent > obj.budget) {
    obj.status = 11; // 11 is the stopped status
  }
  // update db and rebuild model, then:
  obj.is_updating = false; // Unlock the model
}
Note: What I have here is pseudo code as well, so you may have to tweak it a bit.
The idea here is to have a flag in your model to indicate whether an HTTP request can proceed to the critical code path, in this case your checkbudget function and beyond.
When a request comes in, it checks the is_updating flag to see if it can proceed. If the flag is true, it schedules an event to be fired a second later; this setTimeout callback basically becomes an event and gets placed into Node's event queue for later processing.
When that event fires, it checks again. This repeats until the is_updating flag becomes false; then the request goes on to do its work, and is_updating is set back to false when all the critical code is done.
It is not the most efficient way, but it gets the job done, and you can always revisit the solution when performance becomes a problem.
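If you'd rather not poll with timers, another sketch (my own, not part of the answer above) is to serialize the critical section per id with a promise chain; it assumes checkbudget returns a promise that resolves once the db update is done:

const locks = new Map(); // id -> tail of the pending-work chain

function withLock(id, criticalSection) {
  const previous = locks.get(id) || Promise.resolve();
  // Queue the new work behind whatever is already pending for this id
  const current = previous.then(() => criticalSection());
  // Keep the chain alive even if the critical section rejects
  locks.set(id, current.catch(() => {}));
  return current;
}

api.get('/clk/:spent/:id', function(req, res) {
  withLock(req.params.id, function() {
    return checkbudget(Number(req.params.spent), req.params.id);
  }).then(function() {
    res.sendStatus(200);
  });
});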

What happens with unhandled socket.io events?

Does socket.io ignore/drop them?
The reason I'm asking is the following.
There is a client with several states. Each state has its own set of socket handlers. At different moments the server notifies the client of a state change and after that sends several state-dependent messages.
But! It takes some time for the client to change state and set the new handlers. In this window the client can miss some messages because there are no handlers for them yet.
If I understand correctly, unhandled messages are lost to the client.
Maybe I'm missing the concept or doing something wrong... How do I handle this issue?
Unhandled messages are just ignored. It's just like when an event occurs and there are no event listeners for that event. The socket receives the message and doesn't find a handler for it, so nothing happens with it.
You could avoid missing messages by always having the handlers installed and then deciding in the handlers (based on other state) whether to do anything with the message or not.
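For example (a sketch; clientState and renderMessage are hypothetical names):

let clientState = 'lobby'; // updated whenever the server tells us to change state

// The handler stays installed for the whole session; state decides what to do
socket.on('chatMessage', data => {
  if (clientState !== 'chat') return; // ignore messages outside the chat state
  renderMessage(data);                // hypothetical rendering function
});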
jfriend00's answer is a good one, and you are probably fine just leaving the handlers in place and using logic in the callback to ignore events as needed. If you really want to manage the unhandled packets though, read on...
You can get the list of callbacks from the socket internals, and use it to compare to the incoming message header. This client-side code will do just that.
// Save a copy of the onevent function
socket._onevent = socket.onevent;

// Replace the onevent function with a handler that captures all messages
socket.onevent = function (packet) {
  // Compare the list of callbacks to the incoming event name
  if( !Object.keys(socket._callbacks).map(x => x.substr(1)).includes(packet.data[0]) ) {
    console.log(`WARNING: Unhandled Event: ${packet.data}`);
  }
  socket._onevent.apply(socket, Array.prototype.slice.call(arguments));
};
The object socket._callbacks contains the callbacks and the keys are the event names. They have a $ prepended to them, so you can trim that off the entire list by mapping substr(1) onto it. That results in a nice clean list of event names.
IMPORTANT NOTE: Normally you should not attempt to externally modify any object member starting with an underscore. Also, expect that any data in it is unstable. The underscore indicates it is for internal use in that object, class or function. Though this object is not stable, it should be up to date enough for us to use it, and we aren't modifying it directly.
The event name is stored in the first entry under packet.data. Just check to see if it is in the list, and raise the alarm if it is not. Now when you send an event from the server that the client does not know, the client will note it in the browser console.
Now you need to save the unhandled messages in a buffer, to play back once the handlers are available again. So to expand on our client-side code from before...
// Save a copy of the onevent function
socket._onevent = socket.onevent;

// Make buffer and configure buffer timings
socket._packetBuffer = [];
socket._packetBufferWaitTime = 1000; // in milliseconds
socket._packetBufferPopDelay = 50; // in milliseconds

function isPacketUnhandled(packet) {
  return !Object.keys(socket._callbacks).map(x => x.substr(1)).includes(packet.data[0]);
}

// Define the function that will process the buffer
socket._packetBufferHandler = function(packet) {
  if( isPacketUnhandled(packet) ) {
    // packet can't be processed yet, restart wait cycle
    socket._packetBuffer.push(packet);
    console.log(`packet handling not completed, retrying`);
    setTimeout(socket._packetBufferHandler, socket._packetBufferWaitTime, socket._packetBuffer.pop());
  }
  else {
    // packet can be processed now, start going through buffer
    socket._onevent.apply(socket, Array.prototype.slice.call(arguments));
    if(socket._packetBuffer.length > 0) {
      setTimeout(socket._packetBufferHandler, socket._packetBufferPopDelay, socket._packetBuffer.pop());
    }
    else {
      console.log(`all packets in buffer processed`);
      socket._packetsWaiting = false;
    }
  }
};

// Replace the onevent function with a handler that captures all messages
socket.onevent = function (packet) {
  // Compare the list of callbacks to the incoming event name
  if( isPacketUnhandled(packet) ) {
    console.log(`WARNING: Unhandled Event: ${packet.data}`);
    socket._packetBuffer.push(packet);
    if(!socket._packetsWaiting) {
      socket._packetsWaiting = true;
      setTimeout(socket._packetBufferHandler, socket._packetBufferWaitTime, socket._packetBuffer.pop());
    }
  }
  socket._onevent.apply(socket, Array.prototype.slice.call(arguments));
};
Here the unhandled packets get pushed into the buffer and a timer is set running. Once the given amount of time has passed, it starts checking whether the handler for each item is ready. Each one is handled until all are exhausted or a handler is still missing, which triggers another wait.
This can and will stack up unhandled calls until you blow out the client's allotted memory, so make sure that those handlers DO get loaded in a reasonable time span. And take care not to send it anything that will never get handled, because it will keep trying forever.
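One simple guard (my own sketch, not part of the code above) is to stamp each packet with a retry count and drop it once a cap is reached:

const MAX_RETRIES = 30; // hypothetical cap; tune to how long your handlers take to load

function shouldRetry(packet) {
  packet._retries = (packet._retries || 0) + 1;
  if (packet._retries > MAX_RETRIES) {
    console.log(`dropping packet after ${MAX_RETRIES} retries: ${packet.data}`);
    return false;
  }
  return true;
}
// In _packetBufferHandler, re-queue a packet only if shouldRetry(packet) returns true.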
I tested it with really long strings and it was able to push them through, so what they are calling 'packet' is probably not a standard packet.
Tested with SocketIO version 2.2.0 on Chrome.

Calling socket.disconnect in a forEach loop doesn't actually call disconnect on all sockets

I am new to the JavaScript world. Recently I was working on a chat application in Node.js, where I have a method called gracefulShutdown as follows.
var gracefulShutdown = function() {
  logger.info("Received kill signal, shutting down gracefully.");
  server.close();
  logger.info('Disconnecting all the socket.io clients');
  if (Object.keys(io.sockets.sockets).length == 0) process.exit();
  var _map = io.sockets.sockets,
      _socket;
  for (var _k in _map) {
    if (_map.hasOwnProperty(_k)) {
      _socket = _map[_k];
      _socket.disconnect(true);
    }
  }
  ...code here...
  setTimeout(function() {
    logger.error("Could not close connections in time, shutting down");
    process.exit();
  }, 10 * 1000);
};
Here is what happens in the disconnect listener. The removeDisconnectedClient method simply updates an entry in the db to mark the removed client.
socket.on('disconnect', function() {
  removeDisconnectedClient(socket);
});
So in this case the disconnect event wasn't fired for all sockets; it was fired for only a few sockets, seemingly at random. Although I was able to fix it using setTimeout(fn, 0) with the help of a teammate.
I read about it online and understood only this much: setTimeout defers the execution of code by adding it to the end of the event queue. I read about JavaScript execution context, the call stack, and the event loop, but I couldn't put it all together in this context. I really don't understand why and how this issue occurred. Could someone explain it in detail, and what is the best way to solve or avoid it?
It is hard to say for sure without a little more context about the rest of the code in gracefulShutdown but I'm surprised it is disconnecting any of the sockets at all:
_socket = _map[ _k ];
socket.disconnect(true);
It appears that you are assigning an item from _map to the variable _socket but then calling disconnect on socket, which is a different variable. I'm guessing it is a typo and you meant to call disconnect on _socket?
Some of the sockets might be disconnecting for other reasons and the appearance that your loop is disconnecting some but not all the sockets is probably just coincidence.
As far as I can tell from the code you posted, socket should be undefined and you should be getting errors about trying to call the disconnect method on undefined.
From the method name where you use it, I suppose the application exits after attempting to disconnect all sockets. The nature of socket communication is asynchronous, so given a decent number of items in _map it can happen that not all disconnect messages are sent before the process exits.
You can increase the chances by calling exit after some timeout once all sockets are disconnected. However, why disconnect manually at all? On connection interruption, remote sockets will automatically get disconnected...
UPDATE
Socket.io for Node.js doesn't have a callback to know for sure that the packet with the disconnect command was sent, at least in v0.9. I've debugged it and came to the conclusion that it is not possible to catch that moment without modifying the sources.
In the file "socket.io\lib\transports\websocket\hybi-16.js" a method write is called to send the disconnect packet:
WebSocket.prototype.write = function (data) {
  ...
  this.socket.write(buf, 'binary');
  ...
}
Whereas socket.write is defined in Node.js core transport "nodejs-{your-node-version}-src\core-modules-sources\lib\net.js" as
Socket.prototype.write = function(chunk, encoding, cb)
//cb is a callback to be called on writeRequest complete
However as you see this callback is not provided, so socket.io will not know about the packet having been sent.
At the same time, when disconnect() is called for a websocket, the member disconnected is set to true and the "disconnect" event is broadcast, indeed - but synchronously. So the .on('disconnect') handler on the server socket doesn't give any valuable information about whether the packet was sent or not.
Solution
I can draw a general conclusion from this: if it is so critical to make sure that all clients are informed immediately (rather than waiting for a heartbeat timeout, or if heartbeat is disabled), then this logic should be implemented manually.
You can send an ordinary message which tells the client that the server is shutting down, and have the client call socket disconnect as soon as the message is received. At the same time, the server will be able to collect all the acknowledgements.
Server-side:
var sockets = [];
for (var _k in _map) {
  if (_map.hasOwnProperty(_k)) {
    sockets.push(_map[_k]);
  }
}
sockets.map(function (socket) {
  socket.emit('shutdown', function () {
    socket.isShutdown = true;
    var all = sockets.every(function (skt) {
      return skt.isShutdown;
    });
    if (all) {
      // wrap in timeout to let current tick finish before quitting
      setTimeout(function () {
        process.exit();
      });
    }
  });
});
Client-side, the behavior is simple:
socket.on('shutdown', function () {
  socket.disconnect();
});
Thus we make sure each client has explicitly disconnected. We don't care about server. It will be shutdown shortly.
In the example code it looks like io.sockets.sockets is an Object; however, at least in the library version I am using, it is a mutable array which the socket.io library is free to modify each time you remove a socket with disconnect(true).
Thus, when you call disconnect(true) and the currently iterated item at index i is removed, an effect like this happens:
var a = [1, 2, 3, 4];
for (var i in a) {
  a.splice(i, 1); // remove item from array
  alert(i);
}
// alerts 0, 1
Thus, the disconnect(true) call asks socket.io to remove the item from the array - and because you are both holding a reference to the same array, its contents are modified during the loop.
The solution is to create a copy of the _map with slice() before the loop:
var _map = io.sockets.sockets.slice(); // copy of the original
It would create a copy of the original array and thus should go through all the items in the array.
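Applied to the question's loop, the fix looks like this (a sketch; since sockets here behaves as an array, a plain index loop works):

var _map = io.sockets.sockets.slice(); // snapshot: disconnect() can no longer mutate what we iterate
for (var i = 0; i < _map.length; i++) {
  _map[i].disconnect(true);
}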
The reason why calling setTimeout() also works is that it defers the removal of the items from the array, allowing the whole loop to finish iterating before the sockets array is modified.
The problem here is that sockjs and socket.io use asynchronous disconnect methods. I.e. when you call disconnect, the socket is not immediately terminated; it is just a promise that it WILL be terminated. This has the following effect (assuming 3 sockets):
Your for loop grabs the first socket
The disconnect method is called on the first socket
Your for loop grabs the second socket
The disconnect method is called on the second socket
The disconnect method on the first socket finishes
Your for loop grabs the third socket
The disconnect method is called on the third socket
Program kills itself
Notice, that sockets 2 and 3 haven't necessarily finished yet. This could be for a number of reasons.
Finally, setTimeout(fn, 0) is, as you said, deferring the final call, but it may not be consistent (I haven't dug into this too much). By that I mean you've scheduled the final termination to happen AFTER all your sockets have disconnected. The setTimeout and setInterval methods essentially act like a queue; your position in the queue is dictated by the timer you set. Two intervals set for 10s each, where they both run synchronously, will cause one to run AFTER the other.
After Socket.io 1.0, the library does not expose an array of the connected sockets. You can check that io.sockets.sockets.length is not equal to the number of open socket objects. Your best bet is to broadcast a 'disconnect' message to all the clients that you want to drop, and in the on('disconnect') handler on the client side close the actual WebSocket.
