How to execute socket.io endpoints synchronously - javascript

My application is using socket.io, from what I gather, socket.io executes asynchronously. Most of the time this is not a problem, however there is a particular case where 2 users in my app may call the same socket endpoint at the same time, and this causes issues.
What I would rather have is for each socket endpoint to wait until the one before it finishes executing, before it gets executed. If they run asynchronously, I get unexpected results.
On the server I have the following...
// Establish a connection with a WebSocket.
io.on("connection", socket => {
    socket.on("add_song", async (data) => {
        PlaylistHandler.add_song(io, socket, data);
    });
    ...
    ...
add_song gets called at the same time by two different io connections (2 different users). I don't want the function PlaylistHandler.add_song to run in parallel for each, so I tried using async/await...
await PlaylistHandler.add_song(io, socket, data);
That didn't solve anything, I suspect because there are two different io connections making the call.
Is there any way to make the socket call itself run sequentially rather than in parallel?

await doesn't block the event loop, so it indeed doesn't help here: each await PlaylistHandler.add_song only suspends its own io.on listener, so the two calls from the two connections still run interleaved.
Your best/easiest shot is to set a variable calculating = true at the beginning of your add_song, and to postpone any add_song while one is already in progress.
Hope this snippet will inspire you into achieving a workable solution:
let calculating = false;

function add_song(io, socket, data){
    if(calculating){
        setTimeout(function(){
            add_song(io, socket, data);
        }, 500); // polling interval; reduce/increase depending on how time-sensitive the check should be
        return;
    }
    calculating = true;
    // Do all your usual add_song processing
    // After the final operation of add_song
    calculating = false;
}
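If you also want to preserve the order of waiting calls instead of polling with setTimeout, another option (a minimal sketch I'm adding, assuming PlaylistHandler.add_song returns a promise) is to chain every call onto one shared promise, so each add_song starts only after the previous one settles:

// Serialize add_song calls by chaining them onto one shared promise (sketch).
let queue = Promise.resolve();

function add_song_serialized(io, socket, data){
    queue = queue
        .then(() => PlaylistHandler.add_song(io, socket, data)) // assumes this returns a promise
        .catch(err => console.error(err)); // keep the chain alive after a failure
    return queue;
}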

Related

NodeJS: async function with while(true) halts the script

I have a:
socketio server
array of values
async function which iterates the array and deletes them
Here is the code
// import
const SocketIo = require('socket.io')

// setup
const Server = new SocketIo.Server()
const arrayOfValues = ['some', 'values', 'here']

// server welcome message
Server.on('connection', async Socket => {
    iterateValues() // server no longer responds, even if function is async
    Socket.emit('welcome', 'hi')
})

// the async function
async function iterateValues(){
    while(true){
        if( arrayOfValues[0] ) arrayOfValues.splice(0, 1)
    }
}
Whenever the client connects to my server, "iterateValues()" must be invoked and asynchronously remove the items from the array. The problem is that when "iterateValues()" is invoked anywhere, it halts the script and the server no longer responds.
I think I misunderstood what async functions actually are. How can I create a function that can run an infinite while loop without halting the thread of other methods?
I think I misunderstood what async functions actually are.
I am not a back end guy, but will give it a shot. Node.js still runs your JavaScript in a single thread, so an infinite loop blocks the whole program. Asynchronous doesn't mean that the provided callback runs in a separate thread; if Node.js were multithreaded, an infinite loop in one thread wouldn't block another one, for example.
Asynchronous means that the provided callback won't be run immediately but delayed; when it does run, though, it is still within that one thread.
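One common workaround (a sketch I'm adding, not part of the original answer; it also replaces the infinite loop with a terminating condition) is to hand control back to the event loop on every iteration, for example by awaiting setImmediate, so other callbacks can run between iterations:

// Yield to the event loop each iteration so the server stays responsive (sketch).
async function iterateValues(){
    while(arrayOfValues.length > 0){
        arrayOfValues.splice(0, 1)
        // let timers, I/O and other callbacks run before the next iteration
        await new Promise(resolve => setImmediate(resolve))
    }
}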

NodeJS Event Loop Multiple Request

I have a question on the following code below (Source: https://blog.risingstack.com/node-js-at-scale-understanding-node-js-event-loop/):
'use strict'
const express = require('express')
const superagent = require('superagent')
const app = express()

app.get('/', sendWeatherOfRandomCity)

function sendWeatherOfRandomCity (request, response) {
    getWeatherOfRandomCity(request, response)
    sayHi()
}

const CITIES = [
    'london',
    'newyork',
    'paris',
    'budapest',
    'warsaw',
    'rome',
    'madrid',
    'moscow',
    'beijing',
    'capetown',
]

function getWeatherOfRandomCity (request, response) {
    const city = CITIES[Math.floor(Math.random() * CITIES.length)]
    superagent.get(`wttr.in/${city}`)
        .end((err, res) => {
            if (err) {
                console.log('O snap')
                return response.status(500).send('There was an error getting the weather, try looking out the window')
            }
            const responseText = res.text
            response.send(responseText)
            console.log('Got the weather')
        })
    console.log('Fetching the weather, please be patient')
}

function sayHi () {
    console.log('Hi')
}

app.listen(3000);
I have these questions:
When the superagent.get(wttr.in/${city}) in the getWeatherOfRandomCity method makes a web request to http://wttr.in/sf for example, that request will be placed on the task queue and not the main call stack correct?
If there are methods on the main call stack (i.e. the main call stack isn't empty), the end event attached to the superagent.get(wttr.in/${city}).end(...) (that will be pushed to the task queue) will not get called until the main call stack is empty correct? In other words, on every tick of the event loop, it will take one item from the task queue?
Let's say two requests to localhost:3000/ come in one after another. The first request will push the sendWeatherOfRandomCity on the stack, the getWeatherOfRandomCity on the stack, then the web request superagent.get(wttr.in/${city}).end(...) will be placed on the background queue, then console.log('Fetching the weather, please be patient'), then the sendWeatherOfRandomCity will pop off the stack, and finally sayHi() will be pushed on the stack and it will print "Hi" and pop off the stack and finally, the end event attached to superagent.get(wttr.in/${city}).end(...) will be called from the task queue since the main call stack will be empty. Now when the second request comes, it will push all of those same things as the first request onto the main call stack but will the end handler from the first request (still in the task queue) run first or the stuff pushed onto the main call stack by the second web request will run first?
When you make an HTTP request, libuv sees that you are attempting to make a network request. Neither libuv nor Node has code to handle all of the operations involved in a network request; instead, libuv delegates the request-making to the underlying operating system.
It's actually the kernel, the essential part of the operating system, that does the real network work. libuv issues the request and then just waits for the operating system to signal that some response has come back. Because libuv delegates the work to the operating system, the operating system itself decides whether to create a new thread or not, and more generally how to handle the entire process of making the request. Each OS has a different mechanism for this: on Linux it is epoll, on macOS it is kqueue, and on Windows it is GetQueuedCompletionStatusEx.
The event loop has six phases, one of which is the I/O poll phase, and the phases run in a fixed order, with timers first. When a timer expires (or its event is done), its callback is placed in the event queue. The event loop then checks whether the call stack is available. The call stack is where functions go to execute; only one function can be on top of the call stack at a time, so there is no way to execute two things at the same time in the JavaScript runtime.
If the call stack is empty, meaning main() has finished and been removed, the event loop pushes the timer callback onto the call stack and your function is executed. None of the async callbacks are ever going to run before the main function is done.
So when it is time for the I/O poll phase, which handles incoming data and connections, your callback that handles the fetching is executed along the same path the timer callback followed.
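A tiny script (added here for illustration, not part of the original answers) makes this ordering visible; the timer callback can only run once the synchronous code has finished, even with a 0 ms delay:

// Async callbacks wait for the synchronous call stack to drain.
setTimeout(() => console.log('timer callback'), 0)

const start = Date.now()
while (Date.now() - start < 1000) { /* busy-wait: blocks the event loop */ }

console.log('synchronous code done')
// prints "synchronous code done" first, then "timer callback"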

NodeJS http and extremely large response bodies

At the moment, I'm trying to request a very large JSON object from an API (particularly this one) which, depending on various factors, can be upwards of a few MB. The problem, however, is that NodeJS takes forever to do anything and then just runs out of memory: the first line of my response callback never executes.
I could request each item individually, but that is a tremendous number of requests. To quote a dev behind the new API:
Until now, if you wanted to get all the market orders for Tranquility you had to request every type per region individually. That would generally be 50+ regions multiplied by upwards of 13,000 types. Even if it was just 13,000 types and 50 regions, that is 650,000 requests required to get all the market information. And if you wanted to get all the data in the 5-minute cache window, it would require almost 2,200 requests per second.
Obviously, that is not a great idea.
I'm trying to get the array items into redis for use later, then follow the next url and repeat until the last page is reached. Is there any way to do this?
EDIT:
Here's the problem code. Visiting the URL works fine in-browser.
// ...
REGIONS.forEach((region) => {
    LOG.info(' * Grabbing data for `' + region.name + '#' + region.id + '`');
    var href = url + region.id + '/orders/all/', next = href;
    var page = 1;
    while (!!next) {
        https.get(next, (res) => {
            LOG.info(' * * Page ' + page++ + ' responded with ' + res.statusCode);
            // ...
The first LOG.info line executes, while the second does not.
It appears that you are doing a while(!!next) loop which is the cause of your problem. If you show more of the server code, we could advise more precisely and even suggest a better way to code it.
JavaScript runs your code single-threaded. That means one thread of execution runs to completion before any other events can be run.
So, if you do:
while(!!next) {
    https.get(..., (res) => {
        // hoping this will run
    });
}
Then, your callback to https.get() will never get called. Your while loop just keeps running forever, and as long as it is running, the callback from the https.get() can never get called. That request has likely long since completed, and there's an event sitting in the internal JS event queue to call the callback, but until your while() loop finishes, that event can't be processed. So you have a deadlock: the while() loop is waiting for something else to run to change its condition, but nothing else can run until the while() loop is done.
There are several other ways to do serial async iterations. In general, you can't use .forEach() or while().
Here are several schemes for async looping:
Node.js: How do you handle callbacks in a loop?
While loop with jQuery async AJAX calls
How to synchronize a sequence of promises?
How to use after and each in conjunction to create a synchronous loop in underscore js
Or, the async library which you mentioned also has functions for doing async looping.
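On modern Node, the simplest serial pattern (a sketch I'm adding; async/await postdates the original answer, and fetchPage is a hypothetical helper that resolves with the response and the next URL) is an ordinary loop inside an async function, awaiting each request before starting the next:

// Serial async iteration with async/await (sketch; fetchPage is a placeholder).
async function grabAllPages(firstUrl) {
    let next = firstUrl;
    let page = 1;
    while (next) {
        const res = await fetchPage(next); // wait for this page before requesting the next
        console.log('Page ' + page++ + ' responded with ' + res.statusCode);
        next = res.nextUrl; // falsy once the last page is reached
    }
}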
First of all, a few MB of JSON payload is not exactly huge. So the route handler code might require some close scrutiny.
However, to actually deal with huge amounts of JSON, you can consume your request as a stream. JSONStream (along with many other similar libraries) allows you to do this in a memory-efficient way. You can specify the paths you need to process using JSONPath (an XPath analog for JSON) and then subscribe to the stream for matching data sets.
The following example from the JSONStream README illustrates this succinctly:
var request = require('request')
  , JSONStream = require('JSONStream')
  , es = require('event-stream')

request({url: 'http://isaacs.couchone.com/registry/_all_docs'})
    .pipe(JSONStream.parse('rows.*'))
    .pipe(es.mapSync(function (data) {
        console.error(data)
        return data
    }))
Use the stream functionality of the request module to process large amounts of incoming data. As data comes through the stream, parse it to a chunk of data that can be worked with, push that data through the pipe, and pull in the next chunk of data.
You might create a transform stream to manipulate a chunk of data that has been parsed and a write stream to store the chunk of data.
For example:
var stream = request({ url: your_url }).pipe(parseStream)
    .pipe(transformStream)
    .pipe(writeStream);

stream.on('finish', () => {
    setImmediate(() => process.exit(0));
});
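parseStream, transformStream, and writeStream above are placeholders. As a minimal sketch of what the transform stage might look like, using Node's built-in stream module (the field names are hypothetical):

var { Transform } = require('stream');

// Transform stream that reshapes each parsed object as it flows through (sketch).
var transformStream = new Transform({
    objectMode: true, // operate on parsed objects rather than raw bytes
    transform(order, encoding, callback) {
        // keep only the fields we care about (hypothetical field names)
        this.push({ id: order.order_id, price: order.price });
        callback();
    }
});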
For more info on creating streams, see https://bl.ocks.org/joyrexus/10026630

node.js async request with timeout?

Is it possible, in node.js, to make an asynchronous call that times out if it takes too long (or doesn't complete) and triggers a default callback?
The details:
I have a node.js server that receives a request and then makes multiple requests asynchronously behind the scenes, before responding. The basic issue is covered by an existing question, but some of these calls are considered 'nice to have': if we get the response back, it enhances the response to the client, but if they take too long it is better to respond to the client in a timely manner without them.
At the same time, this approach would protect against services that simply aren't completing or are failing, while allowing the main thread of operation to respond.
You can think of this in the same way as a Google search that has one core set of results, but provides extra responses based on other behind the scenes queries.
If it's simple, just use setTimeout:
app.get('/', function (req, res) {
    var result = {};
    // populate object
    http.get('http://www.google.com/index.html', (res) => {
        result.property = response;
        return res.send(result);
    });
    // if we haven't returned within a second, return without data
    setTimeout(function(){
        return res.send(result);
    }, 1000);
});
Edit: as mentioned by peteb, I forgot to check whether we had already sent a response. This can be accomplished by checking res.headersSent or by maintaining a 'sent' flag yourself. I also noticed the res variable was being reassigned by the http.get callback:
app.get('/', function (req, res) {
    var result = {};
    // populate object
    http.get('http://www.google.com/index.html', (httpResponse) => {
        result.property = httpResponse;
        if(!res.headersSent){
            res.send(result);
        }
    });
    // if we haven't responded within a second, respond without data
    setTimeout(function(){
        if(!res.headersSent){
            res.send(result);
        }
    }, 1000);
});
Check this example of a timeout callback: https://github.com/jakubknejzlik/node-timeout-callback/blob/master/index.js
You could modify it to take an action on timeout or simply catch the error.
You can try using a timeout. For example, using the setTimeout() method:
Set up a timeout handler: var timeOutX = setTimeout(function…
Set that variable to null: timeOutX = null (to indicate that the timeout has fired)
Then execute your callback function with one argument (error handling): callback({error:'The async request timed out'});
Add the duration for your timeout, for example 3 seconds
Something like this:
var timeOutX = setTimeout(function() {
    timeOutX = null;
    yourCallbackFunction({error:'The async request timed out'});
}, 3000);
With that set, you can then call your async function and you put a timeout check to make sure that your timeout handler didn’t fire yet.
Finally, before you run your callback function, you must clear that scheduled timeout handler using the clearTimeout() method.
Something like this:
yourAsyncFunction(yourArguments, function() {
    if (timeOutX) {
        clearTimeout(timeOutX);
        yourCallbackFunction();
    }
});
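If the async call returns a promise, the same idea can be written more compactly with Promise.race (a sketch I'm adding, not from the original answers; fetchExtraData and core are hypothetical). Whichever promise settles first wins, so the timeout acts as the fallback:

// Race the real work against a timeout (sketch).
function withTimeout(promise, ms) {
    const timeout = new Promise((resolve, reject) =>
        setTimeout(() => reject(new Error('The async request timed out')), ms)
    );
    return Promise.race([promise, timeout]);
}

// usage: drop the 'nice to have' data if it takes longer than a second
withTimeout(fetchExtraData(), 1000)
    .then(extra => res.send({ core, extra }))
    .catch(() => res.send({ core })); // timed out or failed: respond without extras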

NodeJS promise blocking requests

I am quite confused about why my promise is blocking the node app's requests.
Here is my simplified code:
var express = require('express');
var someModule = require('somemodule');
var app = express();

app.get('/', function (req, res) {
    res.status(200).send('Main');
});

app.get('/status', function (req, res) {
    res.status(200).send('Status');
});

// Init Promise
someModule.doSomething({}).then(function(){}, function(){}, function(progress){
    console.log(progress);
});

var server = app.listen(3000, function () {
    var host = server.address().address;
    var port = server.address().port;
    console.log('Example app listening at http://%s:%s in %s environment', host, port, app.get('env'));
});
And the module:
var q = require('q');

function SomeModule(){
    this.doSomething = function(){
        return q.Promise(function(resolve, reject, notify){
            for (var i = 0; i < 10000; i++){
                notify('Progress ' + i);
            }
            resolve();
        });
    }
}

module.exports = SomeModule;
Obviously this is very simplified. The promise function does some work that takes anywhere from 5 to 30 minutes and has to run only when server starts up.
There is NO async operation in that promise function. Its just a lot of data processing, loops etc.
I want to be able to do requests right away, though. So what I expect is that when I run the server, I can go right away to 127.0.0.1:3000 and see Main, and the same for any other requests.
Eventually I want to see the progress of that task by accessing /status, but I'm sure I can make that work once the server works as expected.
At the moment, when I open / it just hangs until the promise job finishes...
Obviously I'm doing something wrong...
If your task is IO-bound, go with process.nextTick. If your task is CPU-bound, asynchronous calls won't offer much performance-wise. In that case you need to delegate the task to another process. An example solution would be to spawn a child process, do the work there, and pipe the results back to the parent process when done.
See nodejs.org/api/child_process.html for more.
If your application needs to do this often then forking lots of child processes quickly becomes a resource hog - each time you fork, a new V8 process will be loaded into memory. In this case it is probably better to use one of the multiprocessing modules like Node's own Cluster. This module offers easy creation and communication between master-worker processes and can remove a lot of complexity from your code.
See also a related question: Node.js - Sending a big object to child_process is slow
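As a minimal sketch of that child-process approach (my illustration; worker.js and the message shapes are hypothetical), using Node's built-in child_process.fork and its message channel:

// parent.js: run the heavy work in a separate process (sketch)
var fork = require('child_process').fork;

var worker = fork('./worker.js'); // worker.js is hypothetical
worker.send({ items: 10000 }); // hand the job to the child
worker.on('message', function (msg) {
    console.log('progress', msg.progress); // parent's event loop stays free
});

// worker.js: do the CPU-bound loop and report back (sketch)
process.on('message', function (job) {
    for (var i = 0; i < job.items; i++) {
        // ...heavy processing per item...
        if (i % 1000 === 0) process.send({ progress: i });
    }
    process.send({ done: true });
});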
The main thread of JavaScript in node.js is single threaded. So, if you run some giant loop that is processor-bound, it will hog the one thread and no other JS will run in node.js until that one operation is done.
So, when you call:
someModule.doSomething()
and that is all synchronous, then it does not return until it is done executing and thus the lines of code following that don't execute until the doSomething() method returns. And, just so you understand, the use of promises with synchronous CPU-hogging code does not help your cause at all. If it's synchronous and CPU bound, it's just going to take a long time to run before anything else can run.
If there is I/O involved in the loop (like disk I/O or network I/O), then there are opportunities to use async I/O operations and make the code non-blocking. But if not, and it's just a lot of CPU work, then it will block until done and no other code will run.
Your opportunities for changing this are:
Run the CPU-consuming code in another process. Either create a separate program that you run as a child process, passing input to it and getting output from it, or create a separate server that you can make async requests to.
Break the blocking work into chunks where you execute 100ms of work at a time, then yield the processor back to the event loop (using something like setTimeout()) to allow other things in the event queue to be serviced and run before you pick up the next chunk of work. You can see Best way to iterate over an array without blocking the UI for ideas on how to chunk synchronous work.
As an example, you could chunk your current loop. This runs up to 100ms of cycles and then breaks execution to give other things a chance to run. You can set the cycle time to whatever you want.
function SomeModule(){
    this.doSomething = function(){
        return q.Promise(function(resolve, reject, notify){
            var cntr = 0, numIterations = 10000, timePerSlice = 100;
            function run() {
                if (cntr < numIterations) {
                    var start = Date.now();
                    while (Date.now() - start < timePerSlice && cntr < numIterations) {
                        notify('Progress ' + cntr);
                        ++cntr;
                    }
                    // give some other things a chance to run and then call us again
                    // setImmediate() is also an option here, but setTimeout() gives all
                    // other operations a chance to run alongside this operation
                    setTimeout(run, 10);
                } else {
                    resolve();
                }
            }
            run();
        });
    }
}
