In the non-blocking event loop of JavaScript, is it safe to read and then alter a variable? What happens if two processes want to change a variable nearly at the same time?
Example A:
Process 1: Get variable A (it is 100)
Process 2: Get variable A (it is 100)
Process 1: Add 1 (it is 101)
Process 2: Add 1 (it is 101)
Result: Variable A is 101 instead of 102
Here is a simplified example, having an Express route. Lets say the route gets called 1000 per second:
let counter = 0;
const getCounter = () => {
return counter;
};
const setCounter = (newValue) => {
counter = newValue;
};
app.get('/counter', (req, res) => {
const currentValue = getCounter();
const newValue = currentValue + 1;
setCounter(newValue);
});
Example B:
What if we do something more complex like Array.findIndex() and then Array.splice()? Could it be that the found index has become outdated because another event-process already altered the array?
Process A findIndex (it is 12000)
Process B findIndex (it is 34000)
Process A splice index 12000
Process B splice index 34000
Result: Process B removed the wrong index, should have removed 33999 instead
const veryLargeArray = [
// ...
];
app.get('/remove', (req, res) => {
const id = req.query.id;
const i = veryLargeArray.findIndex(val => val.id === id);
veryLargeArray.splice(i, 1);
});
Example C:
What if we add an async operation into Example B?
const veryLargeArray = [
// ...
];
app.get('/remove', (req, res) => {
const id = req.query.id;
const i = veryLargeArray.findIndex(val => val.id === id);
someAsyncFunction().then(() => {
veryLargeArray.splice(i, 1);
});
});
This question was kind of hard to find the right words to describe it. Please feel free to update the title.
As per #ThisIsNoZaku's link, Javascript has a 'Run To Completion' principle:
Each message is processed completely before any other message is processed.
This offers some nice properties when reasoning about your program, including the fact that whenever a function runs, it cannot be pre-empted and will run entirely before any other code runs (and can modify data the function manipulates). This differs from C, for instance, where if a function runs in a thread, it may be stopped at any point by the runtime system to run some other code in another thread.
A downside of this model is that if a message takes too long to complete, the web application is unable to process user interactions like click or scroll. The browser mitigates this with the "a script is taking too long to run" dialog. A good practice to follow is to make message processing short and if possible cut down one message into several messages.
Further reading: https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop
So, for:
Example A: This works perfectly fine as a sitecounter.
Example B: This works perfectly fine as well, but if many requests happen at the same time then the last request submitted will be waiting quite some time.
Example C: If another call to \remove is sent before someAsyncFunction finishes, then it is entirely possible that your array will be invalid. The way to resolve this would be to move the index finding into the .then clause of the async function.
IMO, at the cost of latency, this solves a lot of potentially painful concurrency problems. If you must optimise the speed of your requests, then my advice would be to look into different architectures (additional caching, etc).
Related
I would like to implement fd_read from the WASI API by waiting for the user to type some text in an HTML input field, and then continuing with the rest of the WASI calls. I.e., with something like:
fd_read = (fd, iovs, iovsLen, nread) => {
// only care about 'stdin'
if(fd !== STDIN)
return WASI_ERRNO_BADF;
const encoder = new TextEncoder();
const view = new DataView(memory.buffer);
view.setUint32(nread, 0, true);
// create a UInt8Array for each buffer
const buffers = Array.from({ length: iovsLen }, (_, i) => {
const ptr = iovs + i * 8;
const buf = view.getUint32(ptr, true);
const bufLen = view.getUint32(ptr + 4, true);
return new Uint8Array(memory.buffer, buf, bufLen);
});
// get input for each buffer
buffers.forEach(buf => {
const input = waitForUserInput();
buf.set(encoder.encode(input));
view.setUint32(nread, view.getUint32(nread, true) + input.length, true);
});
return WASI_ESUCCESS;
}
The implementation works if the variable input is provided. For example, setting const input = "1\n" passes that string to a scanf call in my C program, and it reads in a value of 1.
However, I'm struggling to "stop" the JavaScript execution while waiting for the input to be provided. I understand that JavaScript is event-driven and can't be "paused" in the traditional sense, but trying to provide the input as a callback/Promise has the problem of the function still executing, causing nothing to get passed to stdin:
buffers.forEach(buf => {
let input;
waitForUserInput().then(value => {
input = value;
});
buf.set(encoder.encode(input));
view.setUint32(nread, view.getUint32(nread, true) + input.length, true);
});
Since input is still waiting to be set, nothing gets encoded in the buffer and stdin just reads a 0.
Is there a way to wait for the input with async/await, or maybe a "hack-y" solution with setTimeout? I know that window.Prompt() would stop the execution, but I want the input to be a part of the page. Looking for vanilla JavaScript solutions.
You want to connect asynchronous JavaScript APIs to synchronous WebAssembly APIs. This is a common problem for which WebAssembly itself doesn't yet have a built-in solution, but there are some at the tooling level. In particular, you might want to take a look at Asyncify - I've written a detailed post on how it helps solve those use-cases and how to use it here: https://web.dev/asyncify/
Particularly for WASI, the post also showcases a demo that connects fd_read and other synchronous operations to async APIs from File System Access. You can find live demo at https://wasi.rreverser.com/ and its code at https://github.com/GoogleChromeLabs/wasi-fs-access.
For example, here is an implementation of the fd_read function you're interested in, that uses async-await to wait for asynchronous API: https://github.com/GoogleChromeLabs/wasi-fs-access/blob/4c2d29fdfe79abb9b48bd44e296c2019f55d0eec/src/bindings.ts#L449-L461
You should be able to adapt same approach, Asyncify tooling and potentially even the same code to your example using setTimeout or input events.
I have a set of API endpoints in Express. One of them receives a request and starts a long running process that blocks other incoming Express requests.
My goal to make this process non-blocking. To understand better inner logic of Node Event Loop and how I can do it properly, I want to replace this long running function with my dummy long running blocking function that would start when I send a request to its endpoint.
I suppose, that different ways of making the dummy function blocking could cause Node manage these blockings differently.
So, my question is - how can I make a basic blocking process as a function that would run infinitely?
You can use node-webworker-threads.
var Worker, i$, x$, spin;
Worker = require('webworker-threads').Worker;
for (i$ = 0; i$ < 5; ++i$) {
x$ = new Worker(fn$);
x$.onmessage = fn1$;
x$.postMessage(Math.ceil(Math.random() * 30));
}
(spin = function(){
return setImmediate(spin);
})();
function fn$(){
var fibo;
fibo = function(n){
if (n > 1) {
return fibo(n - 1) + fibo(n - 2);
} else {
return 1;
}
};
return this.onmessage = function(arg$){
var data;
data = arg$.data;
return postMessage(fibo(data));
};
}
function fn1$(arg$){
var data;
data = arg$.data;
console.log("[" + this.thread.id + "] " + data);
return this.postMessage(Math.ceil(Math.random() * 30));
}
https://github.com/audreyt/node-webworker-threads
So, my question is - how can I make a basic blocking process as a function that would run infinitely?
function block() {
// not sure why you need that though
while(true);
}
I suppose, that different ways of making the dummy function blocking could cause Node manage these blockings differently.
Not really. I can't think of a "special way" to block the engine differently.
My goal to make this process non-blocking.
If it is really that long running you should really offload it to another thread.
There are short cut ways to do a quick fix if its like a one time thing, you can do it using a npm module that would do the job.
But the right way to do it is setting up a common design pattern called 'Work Queues'. You will need to set up a queuing mechanism, like rabbitMq, zeroMq, etc. How it works is, whenever you get a computation heavy task, instead of doing it in the same thread, you send it to the queue with relevant id values. Then a separate node process commonly called a 'worker' process will be listening for new actions on the queue and will process them as they arrive. This is a worker queue pattern and you can read up on it here:
https://www.rabbitmq.com/tutorials/tutorial-one-javascript.html
I would strongly advise you to learn this pattern as you would come across many tasks that would require this kind of mechanism. Also with this in place you can scale both your node servers and your workers independently.
I am not sure what exactly your 'long processing' is, but in general you can approach this kind of problem in two different ways.
Option 1:
Use the webworker-threads module as #serkan pointed out. The usual 'thread' limitations apply in this scenario. You will need to communicate with the Worker in messages.
This method should be preferable only when the logic is too complicated to be broken down into smaller independent problems (explained in option 2). Depending on complexity you should also consider if native code would better serve the purpose.
Option 2:
Break down the problem into smaller problems. Solve a part of the problem, schedule the next part to be executed later, and yield to let NodeJS process other events.
For example, consider the following example for calculating the factorial of a number.
Sync way:
function factorial(inputNum) {
let result = 1;
while(inputNum) {
result = result * inputNum;
inputNum--;
}
return result;
}
Async way:
function factorial(inputNum) {
return new Promise(resolve => {
let result = 1;
const calcFactOneLevel = () => {
result = result * inputNum;
inputNum--;
if(inputNum) {
return process.nextTick(calcFactOneLevel);
}
resolve(result);
}
calcFactOneLevel();
}
}
The code in second example will not block the node process. You can send the response when returned promise resolves.
I have a number of socket.io servers serving on iterative ports (port 4001, 4002, 4003, etc)
I want to connect each socket client to the corresponding servers using a loop:
connectSockets = (sensors) => {
const responses = {};
for (const [idx, sensor] of sensors.entries()) {
const socket = socketIOClient(`${endpointBase}:${port + idx}`);
socket.on(`From::${sensor}`, data => {
responses[sensor] = data
});
}
this.setState({
responses
});
};
When I break inside the loop on:
responses[sensor] = data
I can see the data. It is even getting assigned to the appropriate "responses" property.
However, when I get out of the loop and break in setState I see:
responses = {}
No idea why. Scoping issue of some kind? Or maybe I am confused as to how socket.io works - first time using it. Interestingly when I break on "const socket" i get three iterations as I expect, but when I break on "responses[sensor] = data" I get a lot more than 3 iterations. Anyone have any ideas?
Well I figured it out on my own. As it turns out it was a mis-understanding as to how socket.io works. As t.niese mentions the callback is async.
What i needed to do was set the state within the socket connection since "data" will not necessarily be available when the loop completes:
socket.on(`From::${sensor}`, data => {
responses[sensor] = data
this.setState({responses});
});
Worked perfectly.
It is not as you assumed in your question (or your answer) as scoping issue, but related to when certain pats of the code are execute. If you place some console.log (or breakpoints in your code) it will become obvious:
connectSockets = (sensors) => {
const responses = {};
console.log('before loop')
for (const [idx, sensor] of sensors.entries()) {
const socket = socketIOClient(`${endpointBase}:${port + idx}`);
socket.on(`From::${sensor}`, data => {
console.log(`socket.on callback for: ${sensor}`)
responses[sensor] = data
});
}
console.log('after loop')
this.setState({responses});
};
The output of in the console, or the order in which those breakpoints are reach is:
before loop
after loop
socket.on callback for: ...
...
socket.on callback for: ...
The event handlers registered with .on are called at the time when an even happens, but because JavaScript is not multi threaded, the event handling cannot happen at the same time when you register an event handler with .on.
You can solve that - as you already figured out - by moving the this.setState({responses}) to the event callback.Now you call this.setState multiple times. If this is a problem, depends on your setState function.
responses[sensor] = data is triggered asynchronously at a later pointer in time.
this.setState({responses}) is being called in the same event loop. This is the reason you wouldn't find it populated with anything.
I'm using the npm library jsdiff, which has a function that determines the difference between two strings. This is a synchronous function, but given two large, very different strings, it will take extremely long periods of time to compute.
diff = jsdiff.diffWords(article[revision_comparison.field], content[revision_comparison.comparison]);
This function is called in a stack that handles an request through Express. How can I, for the sake of the user, make the experience more bearable? I think my two options are:
Cancelling the synchronous function somehow.
Cancelling the user request somehow. (But would this keep the function still running?)
Edit: I should note that given two very large and different strings, I want a different logic to take place in the code. Therefore, simply waiting for the process to finish is unnecessary and cumbersome on the load - I definitely don't want it to run for any long period of time.
fork a child process for that specific task, you can even create a queu to limit the number of child process that can be running in a given moment.
Here you have a basic example of a worker that sends the original express req and res to a child that performs heavy sync. operations without blocking the main (master) thread, and once it has finished returns back to the master the outcome.
Worker (Fork Example) :
process.on('message', function(req,res) {
/* > Your jsdiff logic goes here */
//change this for your heavy synchronous :
var input = req.params.input;
var outcome = false;
if(input=='testlongerstring'){outcome = true;}
// Pass results back to parent process :
process.send(req,res,outcome);
});
And from your Master :
var cp = require('child_process');
var child = cp.fork(__dirname+'/worker.js');
child.on('message', function(req,res,outcome) {
// Receive results from child process
console.log('received: ' + outcome);
res.send(outcome); // end response with data
});
You can perfectly send some work to the child along with the req and res like this (from the Master): (imagine app = express)
app.get('/stringCheck/:input',function(req,res){
child.send(req,res);
});
I found this on jsdiff's repository:
All methods above which accept the optional callback method will run in sync mode when that parameter is omitted and in async mode when supplied. This allows for larger diffs without blocking the event loop. This may be passed either directly as the final parameter or as the callback field in the options object.
This means that you should be able to add a callback as the last parameter, making the function asynchronous. It will look something like this:
jsdiff.diffWords(article[x], content[y], function(err, diff) {
//add whatever you need
});
Now, you have several choices:
Return directly to the user and keep the function running in the background.
Set a 2 second timeout (or whatever limit fits your application) using setTimeout as outlined in this
answer.
If you go with option 2, your code should look something like this
jsdiff.diffWords(article[x], content[y], function(err, diff) {
//add whatever you need
return callback(err, diff);
});
//if this was called, it means that the above operation took more than 2000ms (2 seconds)
setTimeout(function() { return callback(); }, 2000);
I am facing this issue for the past 1 week and I am just confused about this.
Keeping it short and simple to explain the problem.
We have an in memory Model which stores values like budget etc.Now when a call is made to the API it has a spent associated with it.
We then check the in memory model and add the spent to the existing spend and then check to the budget and if it exceeds we donot accept any more clicks of that model. for each call we also udpate the db but that is a async operation.
A short example
api.get('/clk/:spent/:id', function(req, res) {
checkbudget(spent, id);
}
checkbudget(spent, id){
var obj = in memory model[id]
obj.spent+= spent;
obj.spent > obj.budjet // if greater.
obj.status = 11 // 11 is the stopped status
update db and rebuild model.
}
This used to work fine but now with concurrent requests we are getting false spends out spends increase more than budget and it stops after some time. We simulated the call with j meter and found this.
As far as we could find node is async so by the time the status is updated to 11 many threads have already updated the spent for the campaign.
How to have a semaphore kind of logic for Node.js so that the variable budget is in sync with the model
update
db.addSpend(campaignId, spent, function(err, data) {
campaign.spent += spent;
var totalSpent = (+camp.spent) + (+camp.cpb);
if (totalSpent > camp.budget) {
logger.info('Stopping it..');
camp.status = 11; // in-memory stop
var History = [];
History.push(some data);
db.stopCamp(campId, function(err, data) {
if (err) {
logger.error('Error while stopping );
}
model.campMAP = buildCatMap(model);
model.campKeyMap = buildKeyMap(model);
db.campEventHistory(cpcHistory, false, function(err) {
if (err) {
logger.error(Error);
}
})
});
}
});
GIST of the code can anyone help now please
Q: Is there semaphore or equivalent in NodeJs?
A: No.
Q: Then how do NodeJs users deal with race condition?
A: In theory you shouldn't have to as there is no thread in javascript.
Before going deeper into my proposed solution I think it is important for you to know how NodeJs works.
For NodeJs it is driven by an event based architecture. This means that in the Node process there is an event queue that contains all the "to-do" events.
When an event gets pop from the queue, node will execute all of the required code until it is finished. Any async calls that were made during the run were spawned as other events and they are queued up in the event queue until a response is heard back and it is time to run them again.
Q: So what can I do to ensure that only 1 request can perform updates to the database at a time?
A: I believe there are many ways you can achieve this but one of the easier way out is to use the set_timeout API.
Example:
api.get('/clk/:spent/:id', function(req, res) {
var data = {
id: id
spending: spent
}
canProceed(data, /*functions to exec after canProceed=*/ checkbudget);
}
var canProceed = function(data, next) {
var model = in memory model[id];
if (model.is_updating) {
set_timeout(isUpdating(data, next), /*try again in=*/1000/*milliseconds*/);
}
else {
// lock is released. Proceed.
next(data.spending, data.id)
}
}
checkbudget(spent, id){
var obj = in memory model[id]
obj.is_updating = true; // Lock this model
obj.spent+= spent;
obj.spent > obj.budjet // if greater.
obj.status = 11 // 11 is the stopped status
update db and rebuild model.
obj.is_updating = false; // Unlock the model
}
Note: What I got here is pseudo code as well so you'll may have to tweak it a bit.
The idea here is to have a flag in your model to indicate whether a HTTP request can proceed to do the critical code path. In this case your checkbudget function and beyond.
When a request comes in it checks the is_updating flag to see if it can proceed. If it is true then it schedules an event, to be fired in a second later, this "setTimeout" basically becomes an event and gets placed into node's event queue for later processing
When this event gets fired later, the checks again. This occurs until the is_update flag becomes false then the request goes on to do its stuff and is_update is set to false again when all the critical code is done.
Not the most efficient way but it gets the job done, you can always revisit the solution when performance becomes a problem.