Related
I have been reading up on nodejs lately, trying to understand how it handles multiple concurrent requests. I know NodeJs is a single threaded event loop based architecture, and at a given point in time only one statement is going to be executing, i.e. on the main thread and that blocking code/IO calls are handled by the worker threads (default is 4).
Now my question is, what happens when a web server built using NodeJs receives multiple requests? I know that there are lots of similar questions here, but haven't found a concrete answer to my question.
So as an example, let's say we have following code inside a route like /index:
app.use('/index', function(req, res, next) {
console.log("hello index routes was invoked");
readImage("path", function(err, content) {
status = "Success";
if(err) {
console.log("err :", err);
status = "Error"
}
else {
console.log("Image read");
}
return res.send({ status: status });
});
var a = 4, b = 5;
console.log("sum =", a + b);
});
Let's assume that the readImage() function takes around 1 min to read that Image.
If two requests, T1, and T2 come in concurrently, how is NodeJs going to process these request ?
Does it going to take first request T1, process it while queueing the request T2? I assume that if any async/blocking stuff is encountered like readImage, it then sends that to a worker thread (then some point later when async stuff is done that thread notifies the main thread and main thread starts executing the callback?), and continues by executing the next line of code?
When it is done with T1, it then processes the T2 request? Is that correct? Or it can process T2 in between (meaning whilethe code for readImage is running, it can start processing T2)?
Is that right?
Your confusion might be coming from not focusing on the event loop enough. Clearly you have an idea of how this works, but maybe you do not have the full picture yet.
Part 1, Event Loop Basics
When you call the use method, what happens behind the scenes is another thread is created to listen for connections.
However, when a request comes in, because we're in a different thread than the V8 engine (and cannot directly invoke the route function), a serialized call to the function is appended onto the shared event loop, for it to be called later. ('event loop' is a poor name in this context, as it operates more like a queue or stack)
At the end of the JavaScript file, the V8 engine will check if there are any running theads or messages in the event loop. If there are none, it will exit with a code of 0 (this is why server code keeps the process running). So the first Timing nuance to understand is that no request will be processed until the synchronous end of the JavaScript file is reached.
If the event loop was appended to while the process was starting up, each function call on the event loop will be handled one by one, in its entirety, synchronously.
For simplicity, let me break down your example into something more expressive.
function callback() {
setTimeout(function inner() {
console.log('hello inner!');
}, 0); // †
console.log('hello callback!');
}
setTimeout(callback, 0);
setTimeout(callback, 0);
† setTimeout with a time of 0, is a quick and easy way to put something on the event loop without any timer complications, since no matter what, it has always been at least 0ms.
In this example, the output will always be:
hello callback!
hello callback!
hello inner!
hello inner!
Both serialized calls to callback are appended to the event loop before either of them is called. This is guaranteed. That happens because nothing can be invoked from the event loop until after the full synchronous execution of the file.
It can be helpful to think of the execution of your file, as the first thing on the event loop. Because each invocation from the event loop can only happen in series, it becomes a logical consequence, that no other event loop invocation can occur during its execution; Only when the previous invocation is finished, can the next event loop function be invoked.
Part 2, The inner Callback
The same logic applies to the inner callback as well, and can be used to explain why the program will never output:
hello callback!
hello inner!
hello callback!
hello inner!
Like you might expect.
By the end of the execution of the file, two serialized function calls will be on the event loop, both for callback. As the Event loop is a FIFO (first in, first out), the setTimeout that came first, will be be invoked first.
The first thing callback does is perform another setTimeout. As before, this will append a serialized call, this time to the inner function, to the event loop. setTimeout immediately returns, and execution will move on to the first console.log.
At this time, the event loop looks like this:
1 [callback] (executing)
2 [callback] (next in line)
3 [inner] (just added by callback)
The return of callback is the signal for the event loop to remove that invocation from itself. This leaves 2 things in the event loop now: 1 more call to callback, and 1 call to inner.
Now callback is the next function in line, so it will be invoked next. The process repeats itself. A call to inner is appended to the event loop. A console.log prints Hello Callback! and we finish by removing this invocation of callback from the event loop.
This leaves the event loop with 2 more functions:
1 [inner] (next in line)
2 [inner] (added by most recent callback)
Neither of these functions mess with the event loop any further. They execute one after the other, the second one waiting for the first one's return. Then when the second one returns, the event loop is left empty. This fact, combined with the fact that there are no other threads currently running, triggers the end of the process, which exits with a return code of 0.
Part 3, Relating to the Original Example
The first thing that happens in your example, is that a thread is created within the process which will create a server bound to a particular port. Note, this is happening in precompiled C++ code, not JavaScript, and is not a separate process, it's a thread within the same process. see: C++ Thread Tutorial.
So now, whenever a request comes in, the execution of your original code won't be disturbed. Instead, incoming connection requests will be opened, held onto, and appended to the event loop.
The use function, is the gateway into catching the events for incoming requests. Its an abstraction layer, but for the sake of simplicity, it's helpful to think of the use function like you would a setTimeout. Except, instead of waiting a set amount of time, it appends the callback to the event loop upon incoming http requests.
So, let's assume that there are two requests coming in to the server: T1 and T2. In your question you say they come in concurrently, since this is technically impossible, I'm going to assume they are one after the other, with a negligible time in between them.
Whichever request comes in first, will be handled first by the secondary thread from earlier. Once that connection has been opened, it's appended to the event loop, and we move on to the next request, and repeat.
At any point after the first request is added to the event loop, V8 can begin execution of the use callback.
A quick aside about readImage
Since its unclear whether readImage is from a particular library, something you wrote or otherwise, it's impossible to tell exactly what it will do in this case. There are only 2 possibilities though, so here they are:
It's entirely synchronous, never using an alternate thread or the event loop
function readImage (path, callback) {
let image = fs.readFileSync(path);
callback(null, image);
// a definition like this will force the callback to
// fully return before readImage returns. This means
// means readImage will block any subsequent calls.
}
It's entirely asynchronous, and takes advantage of fs' async callback.
function readImage (path, callback) {
fs.readFile(path, (err, data) => {
callback(err, data);
});
// a definition like this will force the readImage
// to immediately return, and allow exectution
// to continue.
}
For the purposes of explanation, I'll be operating under the assumption that readImage will immediately return, as proper asynchronous functions should.
Once the use callback execution is started, the following will happen:
The first console log will print.
readImage will kick off a worker thread and immediately return.
The second console log will print.
During all of this, its important to note, these operations are happening synchronously; No other event loop invocation can start until these are finished. readImage may be asynchronous, but calling it is not, the callback and usage of a worker thread is what makes it asynchronous.
After this use callback returns, the next request has probably already finished parsing and was added to the event loop, while V8 was busy doing our console logs and readImage call.
So the next use callback is invoked, and repeats the same process: log, kick off a readImage thread, log again, return.
After this point, the readImage functions (depending on how long they take) have probably already retrieved what they needed and appended their callback to the event loop. So they will get executed next, in order of whichever one retrieved its data first. Remember, these operations were happening in separate threads, so they happened not only in parallel to the main javascript thread, but also parallel to each other, so here, it doesn't matter which one got called first, it matters which one finished first, and got 'dibs' on the event loop.
Whichever readImage completed first will be the first one to execute. So, assuming no errors occured, we'll print out to the console, then write to the response for the corresponding request, held in lexical scope.
When that send returns, the next readImage callback will begin execution: console log, and writing to the response.
At this point, both readImage threads have died, and the event loop is empty, but the thread that holds the server port binding is keeping the process alive, waiting for something else to add to the event loop, and the cycle to continue.
I hope this helps you understand the mechanics behind the asynchronous nature of the example you provided.
For each incoming request, node will handle it one by one. That means there must be order, just like the queue, first in first serve. When node starts processing request, all synchronous code will execute, and asynchronous will pass to work thread, so node can start to process the next request. When the asynchrous part is done, it will go back to main thread and keep going.
So when your synchronous code takes too long, you block the main thread, node won't be able to handle other request, it's easy to test.
app.use('/index', function(req, res, next) {
// synchronous part
console.log("hello index routes was invoked");
var sum = 0;
// useless heavy task to keep running and block the main thread
for (var i = 0; i < 100000000000000000; i++) {
sum += i;
}
// asynchronous part, pass to work thread
readImage("path", function(err, content) {
// when work thread finishes, add this to the end of the event loop and wait to be processed by main thread
status = "Success";
if(err) {
console.log("err :", err);
status = "Error"
}
else {
console.log("Image read");
}
return res.send({ status: status });
});
// continue synchronous part at the same time.
var a = 4, b = 5;
console.log("sum =", a + b);
});
Node won't start processing the next request until finish all synchronous part. So people said don't block the main thread.
There are a number of articles that explain this such as this one
The long and the short of it is that nodejs is not really a single threaded application, its an illusion. The diagram at the top of the above link explains it reasonably well, however as a summary
NodeJS event-loop runs in a single thread
When it gets a request, it hands that request off to a new thread
So, in your code, your running application will have a PID of 1 for example. When you get request T1 it creates PID 2 that processes that request (taking 1 minute). While thats running you get request T2 which spawns PID 3 also taking 1 minute. Both PID 2 and 3 will end after their task is completed, however PID 1 will continue listening and handing off events as and when they come in.
In summary, NodeJS being 'single threaded' is true, however its just an event-loop listener. When events are heard (requests), it passes them off to a pool of threads that execute asynchronously, meaning its not blocking other requests.
You can simply create child process by shifting readImage() function in a different file using fork().
The parent file, parent.js:
const { fork } = require('child_process');
const forked = fork('child.js');
forked.on('message', (msg) => {
console.log('Message from child', msg);
});
forked.send({ hello: 'world' });
The child file, child.js:
process.on('message', (msg) => {
console.log('Message from parent:', msg);
});
let counter = 0;
setInterval(() => {
process.send({ counter: counter++ });
}, 1000);
Above article might be useful to you .
In the parent file above, we fork child.js (which will execute the file with the node command) and then we listen for the message event. The message event will be emitted whenever the child uses process.send, which we’re doing every second.
To pass down messages from the parent to the child, we can execute the send function on the forked object itself, and then, in the child script, we can listen to the message event on the global process object.
When executing the parent.js file above, it’ll first send down the { hello: 'world' } object to be printed by the forked child process and then the forked child process will send an incremented counter value every second to be printed by the parent process.
The V8 JS interpeter (ie: Node) is basically single threaded. But, the processes it kicks off can be async, example: 'fs.readFile'.
As the express server runs, it will open new processes as it needs to complete the requests. So the 'readImage' function will be kicked off (usually asynchronously) meaning that they will return in any order. However the server will manage which response goes to which request automatically.
So you will NOT have to manage which readImage response goes to which request.
So basically, T1 and T2, will not return concurrently, this is virtually impossible. They are both heavily reliant on the Filesystem to complete the 'read' and they may finish in ANY ORDER (this cannot be predicted). Note that processes are handled by the OS layer and are by nature multithreaded (in a modern computer).
If you are looking for a queue system, it should not be too hard to implement/ensure that images are read/returned in the exact order that they are requested.
Since there's not really more to add to the previous answer from Marcus - here's a graphic that explains the single threaded event-loop mechanism:
For the past two days I have been working with chrome asynchronous storage. It works "fine" if you have a function. (Like Below):
chrome.storage.sync.get({"disableautoplay": true}, function(e){
console.log(e.disableautoplay);
});
My problem is that I can't use a function with what I'm doing. I want to just return it, like LocalStorage can. Something like:
var a = chrome.storage.sync.get({"disableautoplay": true});
or
var a = chrome.storage.sync.get({"disableautoplay": true}, function(e){
return e.disableautoplay;
});
I've tried a million combinations, even setting a public variable and setting that:
var a;
window.onload = function(){
chrome.storage.sync.get({"disableautoplay": true}, function(e){
a = e.disableautoplay;
});
}
Nothing works. It all returns undefined unless the code referencing it is inside the function of the get, and that's useless to me. I just want to be able to return a value as a variable.
Is this even possible?
EDIT: This question is not a duplicate, please allow me to explain why:
1: There are no other posts asking this specifically (I spent two days looking first, just in case).
2: My question is still not answered. Yes, Chrome Storage is asynchronous, and yes, it does not return a value. That's the problem. I'll elaborate below...
I need to be able to get a stored value outside of the chrome.storage.sync.get function. I -cannot- use localStorage, as it is url specific, and the same values cannot be accessed from both the browser_action page of the chrome extension, and the background.js. I cannot store a value with one script and access it with another. They're treated separately.
So my only solution is to use Chrome Storage. There must be some way to get the value of a stored item and reference it outside the get function. I need to check it in an if statement.
Just like how localStorage can do
if(localStorage.getItem("disableautoplay") == true);
There has to be some way to do something along the lines of
if(chrome.storage.sync.get("disableautoplay") == true);
I realize it's not going to be THAT simple, but that's the best way I can explain it.
Every post I see says to do it this way:
chrome.storage.sync.get({"disableautoplay": true, function(i){
console.log(i.disableautoplay);
//But the info is worthless to me inside this function.
});
//I need it outside this function.
Here's a tailored answer to your question. It will still be 90% long explanation why you can't get around async, but bear with me — it will help you in general. I promise there is something pertinent to chrome.storage in the end.
Before we even begin, I will reiterate canonical links for this:
After calling chrome.tabs.query, the results are not available
(Chrome specific, excellent answer by RobW, probably easiest to understand)
Why is my variable unaltered after I modify it inside of a function? - Asynchronous code reference (General canonical reference on what you're asking for)
How do I return the response from an asynchronous call?
(an older but no less respected canonical question on asynchronous JS)
You Don't Know JS: Async & Performance (ebook on JS asynchronicity)
So, let's discuss JS asynchonicity.
Section 1: What is it?
First concept to cover is runtime environment. JavaScript is, in a way, embedded in another program that controls its execution flow - in this case, Chrome. All events that happen (timers, clicks, etc.) come from the runtime environment. JavaScript code registers handlers for events, which are remembered by the runtime and are called as appropriate.
Second, it's important to understand that JavaScript is single-threaded. There is a single event loop maintained by the runtime environment; if there is some other code executing when an event happens, that event is put into a queue to be processed when the current code terminates.
Take a look at this code:
var clicks = 0;
someCode();
element.addEventListener("click", function(e) {
console.log("Oh hey, I'm clicked!");
clicks += 1;
});
someMoreCode();
So, what is happening here? As this code executes, when the execution reaches .addEventListener, the following happens: the runtime environment is notified that when the event happens (element is clicked), it should call the handler function.
It's important to understand (though in this particular case it's fairly obvious) that the function is not run at this point. It will only run later, when that event happens. The execution continues as soon as the runtime acknowledges 'I will run (or "call back", hence the name "callback") this when that happens.' If someMoreCode() tries to access clicks, it will be 0, not 1.
This is what called asynchronicity, as this is something that will happen outside the current execution flow.
Section 2: Why is it needed, or why synchronous APIs are dying out?
Now, an important consideration. Suppose that someMoreCode() is actually a very long-running piece of code. What will happen if a click event happened while it's still running?
JavaScript has no concept of interrupts. Runtime will see that there is code executing, and will put the event handler call into the queue. The handler will not execute before someMoreCode() finishes completely.
While a click event handler is extreme in the sense that the click is not guaranteed to occur, this explains why you cannot wait for the result of an asynchronous operation. Here's an example that won't work:
element.addEventListener("click", function(e) {
console.log("Oh hey, I'm clicked!");
clicks += 1;
});
while(1) {
if(clicks > 0) {
console.log("Oh, hey, we clicked indeed!");
break;
}
}
You can click to your heart's content, but the code that would increment clicks is patiently waiting for the (non-terminating) loop to terminate. Oops.
Note that this piece of code doesn't only freeze this piece of code: every single event is no longer handled while we wait, because there is only one event queue / thread. There is only one way in JavaScript to let other handlers do their job: terminate current code, and let the runtime know what to call when something we want occurs.
This is why asynchronous treatment is applied to another class of calls that:
require the runtime, and not JS, to do something (disk/network access for example)
are guaranteed to terminate (whether in success or failure)
Let's go with a classic example: AJAX calls. Suppose we want to load a file from a URL.
Let's say that on our current connection, the runtime can request, download, and process the file in the form that can be used in JS in 100ms.
On another connection, that's kinda worse, it would take 500ms.
And sometimes the connection is really bad, so runtime will wait for 1000ms and give up with a timeout.
If we were to wait until this completes, we would have a variable, unpredictable, and relatively long delay. Because of how JS waiting works, all other handlers (e.g. UI) would not do their job for this delay, leading to a frozen page.
Sounds familiar? Yes, that's exactly how synchronous XMLHttpRequest works. Instead of a while(1) loop in JS code, it essentially happens in the runtime code - since JavaScript cannot let other code execute while it's waiting.
Yes, this allows for a familiar form of code:
var file = get("http://example.com/cat_video.mp4");
But at a terrible, terrible cost of everything freezing. A cost so terrible that, in fact, the modern browsers consider this deprecated. Here's a discussion on the topic on MDN.
Now let's look at localStorage. It matches the description of "terminating call to the runtime", and yet it is synchronous. Why?
To put it simply: historical reasons (it's a very old specification).
While it's certainly more predictable than a network request, localStorage still needs the following chain:
JS code <-> Runtime <-> Storage DB <-> Cache <-> File storage on disk
It's a complex chain of events, and the whole JS engine needs to be paused for it. This leads to what is considered unacceptable performance.
Now, Chrome APIs are, from ground up, designed for performance. You can still see some synchronous calls in older APIs like chrome.extension, and there are calls that are handled in JS (and therefore make sense as synchronous) but chrome.storage is (relatively) new.
As such, it embraces the paradigm "I acknowledge your call and will be back with results, now do something useful meanwhile" if there's a delay involved with doing something with runtime. There are no synchronous versions of those calls, unlike XMLHttpRequest.
Quoting the docs:
It's [chrome.storage] asynchronous with bulk read and write operations, and therefore faster than the blocking and serial localStorage API.
Section 3: How to embrace asynchronicity?
The classic way to deal with asynchronicity are callback chains.
Suppose you have the following synchronous code:
var result = doSomething();
doSomethingElse(result);
Suppose that, now, doSomething is asynchronous. Then this becomes:
doSomething(function(result) {
doSomethingElse(result);
});
But what if it's even more complex? Say it was:
function doABunchOfThings() {
var intermediate = doSomething();
return doSomethingElse(intermediate);
}
if (doABunchOfThings() == 42) {
andNowForSomethingCompletelyDifferent()
}
Well.. In this case you need to move all this in the callback. return must become a call instead.
function doABunchOfThings(callback) {
doSomething(function(intermediate) {
callback(doSomethingElse(intermediate));
});
}
doABunchOfThings(function(result) {
if (result == 42) {
andNowForSomethingCompletelyDifferent();
}
});
Here you have a chain of callbacks: doABunchOfThings calls doSomething immediately, which terminates, but sometime later calls doSomethingElse, the result of which is fed to if through another callback.
Obviously, the layering of this can get messy. Well, nobody said that JavaScript is a good language.. Welcome to Callback Hell.
There are tools to make it more manageable, for example Promises and async/await. I will not discuss them here (running out of space), but they do not change the fundamental "this code will only run later" part.
Section TL;DR: I absolutely must have the storage synchronous, halp!
Sometimes there are legitimate reasons to have a synchronous storage. For instance, webRequest API blocking calls can't wait. Or Callback Hell is going to cost you dearly.
What you can do is have a synchronous cache of the asynchronous chrome.storage. It comes with some costs, but it's not impossible.
Consider:
var storageCache = {};
chrome.storage.sync.get(null, function(data) {
storageCache = data;
// Now you have a synchronous snapshot!
});
// Not HERE, though, not until "inner" code runs
If you can put ALL your initialization code in one function init(), then you have this:
var storageCache = {};
chrome.storage.sync.get(null, function(data) {
storageCache = data;
init(); // All your code is contained here, or executes later that this
});
By the time code in init() executes, and afterwards when any event that was assigned handlers in init() happens, storageCache will be populated. You have reduced the asynchronicity to ONE callback.
Of course, this is only a snapshot of what storage looks at the time of executing get(). If you want to maintain coherency with storage, you need to set up updates to storageCache via chrome.storage.onChanged events. Because of the single-event-loop nature of JS, this means the cache will only be updated while your code doesn't run, but in many cases that's acceptable.
Similarly, if you want to propagate changes to storageCache to the real storage, just setting storageCache['key'] is not enough. You would need to write a set(key, value) shim that BOTH writes to storageCache and schedules an (asynchronous) chrome.storage.sync.set.
Implementing those is left as an exercise.
Make the main function "async" and make a "Promise" in it :)
async function mainFuction() {
var p = new Promise(function(resolve, reject){
chrome.storage.sync.get({"disableautoplay": true}, function(options){
resolve(options.disableautoplay);
})
});
const configOut = await p;
console.log(configOut);
}
Yes, you can achieve that using promise:
let getFromStorage = keys => new Promise((resolve, reject) =>
chrome.storage.sync.get(...keys, result => resolve(result)));
chrome.storage.sync.get has no returned values, which explains why you would get undefined when calling something like
var a = chrome.storage.sync.get({"disableautoplay": true});
chrome.storage.sync.get is also an asynchronous method, which explains why in the following code a would be undefined unless you access it inside the callback function.
var a;
window.onload = function(){
chrome.storage.sync.get({"disableautoplay": true}, function(e){
// #2
a = e.disableautoplay; // true or false
});
// #1
a; // undefined
}
If you could manage to work this out you will have made a source of strange bugs. Messages are executed asynchronously which means that when you send a message the rest of your code can execute before the asychronous function returns. There is not guarantee for that since chrome is multi-threaded and the get function may delay, i.e. hdd is busy.
Using your code as an example:
var a;
window.onload = function(){
chrome.storage.sync.get({"disableautoplay": true}, function(e){
a = e.disableautoplay;
});
}
if(a)
console.log("true!");
else
console.log("false! Maybe undefined as well. Strange if you know that a is true, right?");
So it will be better if you use something like this:
chrome.storage.sync.get({"disableautoplay": true}, function(e){
a = e.disableautoplay;
if(a)
console.log("true!");
else
console.log("false! But maybe undefined as well");
});
If you really want to return this value then use the javascript storage API. This stores only string values so you have to cast the value before storing and after getting it.
//Setting the value
localStorage.setItem('disableautoplay', JSON.stringify(true));
//Getting the value
var a = JSON.stringify(localStorage.getItem('disableautoplay'));
var a = await chrome.storage.sync.get({"disableautoplay": true});
This should be in an async function. e.g. if you need to run it at top level, wrap it:
(async () => {
var a = await chrome.storage.sync.get({"disableautoplay": true});
})();
I have a UI where I need animations to run smoothly. Every so often, I need to do a semi-large data calculation that makes the animation skip until it is this calculation is completed.
I am trying to get around this by making the data calculation async with setTimeout. Something like setTimeout(calcData(), 0);
The whole code is something like this (simplified):
while (animating) {
performAnimation();
if (needCalc) {
setTimeout(calcData(), 0);
}
}
But I still get a skip in the animation. It runs smoothly when I do not need to do any data calculations. How can I do this effectively? Thanks!
You are seeing the skip because only one javascript thread is run at once. When something is done asynchronously the javascript engine puts it a queue to be ran later, then finds something else to execute. When something in the queue needs to be done the engine will pull it out and execute it, blocking all other operations until it completes.The engine then pulls something else out of its queue to execute.
So if you want to allow your render to run smoothly you must break up your calculation into multiple async calls, allowing the engine to schedule the render operation in between calculations. This is easy to accomplish if you are just iterating over a array, so you can do something like:
var now=Date.now;
if(window.performance&&performance.now){//use performace.now if we can
now=performance.now;
}
function calculate(){
var batchSize=10;//If you have a exceptionally long operation you may want to make this lower.
var i=0;
var next=function(){
var start=now();
while(now()-start<14){//14ms / frame
var end=Math.min(i+batchSize,data.length);
for(;i<end;i++){//do batches to reduce time overhead
do_calc(data[i]);
}
}
if(i<data.length) setTimeout(next,1)//defer to next tick
};
next();
}
calculate();
function render(){
do_render_stuff();
if(animating) {
requestAnimationFrame(render);//use requestAnimationFrame rather then setTimeout for rendering
}
}
render();
Better yet, if you can, you should use WebWorkers which work in a different thread, completely separate from the main js engine. However you are stuck with this if you need to do something you cant do in a WebWorker, such as manipulating the DOM tree.
Firstly, let's talk what's going on in your code:
while (animating) {
performAnimation();
if (needCalc) {
// it should be setTimeout(calcData, 0);
setTimeout(calcData(), 0);
}
}
In line setTimeout(calcData(), 0); really you don't defer calling of calcData function, you call it, because you use () operator after function name.
Secondly, lets think, what's going on when you really make defer calling for calcData in the code above: commonly JavaScript is running in one thread, so, if you have code like this:
setTimeout(doSomething, 0);
while (true) {};
doSomething will never be called, because interpreter of javascript executes while loop forever and it hasn't "free time" to execute other things (even UI) . setTimeout - just say to schedule execution of doSomething when interpreter will be free and it's time to execute this function.
So, when browser executes javascript function, all other stuff become freezing.
Solution:
If you have big data that you need to process, maybe it would be better to make calculations on backend and after send results to frontend.
Usually when you need to make some calculation and render results it's better to use requestAnimationFrame than while loop. Browser will execute function passed in requestAnimationFrame as soon as possible, but also you give browser a time to handle other events. You can see smooth redrawing using requestAnimationFrame for game (step-by-step tutorial here).
If you really want to process huge amount of data at frontend part and you want to make ui works smooth, you can try to use WebWorkers. WebWorkers look like threads in JavaScript, you need to communicate between main UI "thread" and WebWorker by passing messages from one to another and back and calculations on WebWorker don't affect UI thread.
Mostly, your problem boils down to your incorrect usage of setTimeout()
setTimeout(calcData(), 0);
The first argument to setTimeout is a REFERENCE to the function that you wish to call. In your code, you are not referencing the calcData function, you are invoking it because you've included () after the function name.
Second, the fact that you've put 0 for the delay does not mean you will have a 0 second delay before the function runs. JavaScript runs in a single threaded context. The setTimeout function is placed in a queue and executed when the JavaScript engine is available, but no sooner than a minimum of 10ms or the amount you specify (whichever is less).
Realistically, your line should be:
setTimeout(calcData(),10);
I've been using selenium (with python bindings and through protractor mostly) for a rather long time and every time I needed to execute a javascript code, I've used execute_script() method. For example, for scrolling the page (python):
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
Or, for infinite scrolling inside an another element (protractor):
var div = element(by.css('div.table-scroll'));
var lastRow = element(by.css('table#myid tr:last-of-type'));
browser.executeScript("return arguments[0].offsetTop;", lastRow.getWebElement()).then(function (offset) {
browser.executeScript('arguments[0].scrollTop = arguments[1];', div.getWebElement(), offset).then(function() {
// assertions
});
});
Or, for getting a dictionary of all element attributes (python):
driver.execute_script('var items = {}; for (index = 0; index < arguments[0].attributes.length; ++index) { items[arguments[0].attributes[index].name] = arguments[0].attributes[index].value }; return items;', element)
But, WebDriver API also has execute_async_script() which I haven't personally used.
What use cases does it cover? When should I use execute_async_script() instead of the regular execute_script()?
The question is selenium-specific, but language-agnostic.
When should I use execute_async_script() instead of the regular execute_script()?
When it comes to checking conditions on the browser side, all checks you can perform with execute_async_script can be performed with execute_script. Even if what you are checking is asynchronous. I know because once upon a time there was a bug with execute_async_script that made my tests fail if the script returned results too quickly. As far as I can tell, the bug is gone now so I've been using execute_async_script but for months beforehand, I used execute_script for tasks where execute_async_script would have been more natural. For instance, performing a check that requires loading a module with RequireJS to perform the check:
driver.execute_script("""
// Reset in case it's been used already.
window.__selenium_test_check = undefined;
require(["foo"], function (foo) {
window.__selenium_test_check = foo.computeSomething();
});
""")
result = driver.wait(lambda driver:
driver.execute_script("return window.__selenium_test_check;"))
The require call is asynchronous. The problem with this though, besides leaking a variable into the global space, is that it multiplies the network requests. Each execute_script call is a network request. The wait method works by polling: it runs the test until the returned value is true. This means one network request per check that wait performs (in the code above).
When you test locally it is not a big deal. If you have to go through the network because you are having the browsers provisioned by a service like Sauce Labs (which I use, so I'm talking from experience), each network request slows down your test suite. So using execute_async_script not only allows writing a test that looks more natural (call a callback, as we normally do with asynchronous code, rather than leak into the global space) but it also helps the performance of your tests.
result = driver.execute_async_script("""
var done = arguments[0];
require(["foo"], function (foo) {
done(foo.computeSomething());
});
""")
The way I see it now is that if a test is going to hook into asynchronous code on the browser side to get a result, I use execute_async_script. If it is going to do something for which there is no asynchronous method available, I use execute_script.
Here's the reference to the two APIs (well it's Javadoc, but the functions are the same), and here's an excerpt from it that highlights the difference
[executeAsyncScript] Execute an asynchronous piece of JavaScript in
the context of the currently selected frame or window. Unlike
executing synchronous JavaScript, scripts executed with this method
must explicitly signal they are finished by invoking the provided
callback. This callback is always injected into the executed function
as the last argument.
Basically, execSync blocks further actions being performed by the selenium browser, while execAsync does not block and calls on a callback when it's done.
Since you've worked with protractor, I'll use that as example.
Protractor uses executeAsyncScript in both get and waitForAngular
In waitForAngular, protractor needs to wait until angular announces that all events settled. You can't use executeScript because that needs to return a value at the end (although I guess you can implement a busy loop that polls angular constantly until it's done). The way it works is that protractor provides a callback, which Angular calls once all events settled, and that requires executeAsyncScript. Code here
In get, protractor needs to poll the page until the global window.angular is set by Angular. One way to do it is driver.wait(function() {driver.executeScript('return window.angular')}, 5000), but that way protractor would pound at the browser every few ms. Instead, we do this (simplified):
functions.testForAngular = function(attempts, callback) {
var check = function(n) {
if (window.angular) {
callback('good');
} else if (n < 1) {
callback('timedout');
} else {
setTimeout(function() {check(n - 1);}, 1000);
}
};
check(attempts);
};
Again, that requires executeAsyncScript because we don't have a return value immediately. Code here
All in all, use executeAsyncScript when you care about a return value in a calling script, but that return value won't be available immediately. This is especially necessary if you can't poll for the result, but must get the result using a callback or promise (which you must translate to callback yourself).
I'm starting to write a server in node.js and wondering whether or not I'm doing things the right way...
Basically my structure is like the following pseudocode:
function processStatus(file, data, status) {
...
}
function gotDBInfo(dbInfo) {
var myFile = dbInfo.file;
function gotFileInfo(fileInfo) {
var contents = fileInfo.contents;
function sentMessage(status) {
processStatus(myFile, contents, status);
}
sendMessage(myFile.name + contents, sentMessage);
}
checkFile(myFile, gotFileInfo);
}
checkDB(query, gotDBInfo);
In general, I'm wondering if this is the right way to code for node.js, and more specifically:
1) Is the VM smart enough to run "concurrently" (i.e. switch contexts) between each callback to not get hung up with lots of connected clients?
2) When garbage collection is run, will it clear the memory completely if the last callback (processStatus) finished?
Node.js is event-based, all codes are basically handlers of events. The V8 engine will execute-to-end any synchronous code in the handler and then process the next event.
Async call (network/file IO) will post an event to another thread to do the blocking IO (that's in libev libeio AFAIK, I may be wrong on this). Your app can then handle other clients. When the IO task is done, an event is fired and your callback function is called upon.
Here's an example of aync call flow, simulating a Node app handling a client request:
onRequest(req, res) {
// we have to do some IO and CPU intensive task before responding the client
asyncCall(function callback1() {
// callback1() trigger after asyncCall() done it's part
// *note that some other code might have been executed in between*
moreAsyncCall(function callback2(data) {
// callback2() trigger after moreAsyncCall() done it's part
// note that some other code might have been executed in between
// res is in scope thanks to closure
res.end(data);
// callback2() returns here, Node can execute other code
// the client should receive a response
// the TCP connection may be kept alive though
});
// callback1() returns here, Node can execute other code
// we could have done the processing of asyncCall() synchronously
// in callback1(), but that would block for too long
// so we used moreAsyncCall() to *yield to other code*
// this is kind of like cooperative scheduling
});
// tasks are scheduled by calling asyncCall()
// onRequest() returns here, Node can execute other code
}
When V8 does not have enough memory, it will do garbage collection. It knows whether a chunk of memory is reachable by live JavaScript object. I'm not sure if it will aggressively clean up memory upon reaching end of function.
References:
This Google I/O presentation discussed the GC mechanism of Chrome (hence V8).
http://platformjs.wordpress.com/2010/11/24/node-js-under-the-hood/
http://blog.zenika.com/index.php?post/2011/04/10/NodeJS