Javascript internals - clearTimeout just before it fires - javascript

Let's say I do this:
var timer = setTimeout(function() {
  console.log("will this happen?");
}, 5000);
And then after just less than 5 seconds, another callback (from a network event in NodeJS for example) fires and clears it:
clearTimeout(timer);
Is there any possibility that the callback from the setTimeout call is already in the queue to be executed at this point, and if so will the clearTimeout be in time to stop it?
To clarify, I am talking about a situation where the setTimeout time actually expires and the interpreter starts the process of executing it, but the other callback is currently running so the message is added to the queue. It seems like one of those race condition type things that would be easy to not account for.

Even though Node is single-threaded, the race condition the question describes is possible.
It can happen because timers are triggered by native code (in libuv).
On top of that, Node groups timers with the same timeout value. As a result, if you schedule two timers with the same timeout within the same millisecond, they will be added to the event queue at once.
But rest assured, Node solves that for you internally. Quoting code from Node 0.12.0:
timer.js > clearTimeout
exports.clearTimeout = function(timer) {
  if (timer && (timer[kOnTimeout] || timer._onTimeout)) {
    timer[kOnTimeout] = timer._onTimeout = null;
    // ...
  }
}
On clearing a timeout, Node internally removes the reference to the callback function. So even if the race condition happens, it can do no harm, because those timers will be skipped:
listOnTimeout
if (!first._onTimeout) continue;
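The scenario from the question can be reproduced with a small experiment. This is only a sketch: `clearAfterDeadline` is a hypothetical helper name, and the busy-wait loop stands in for "another callback is currently running" while the timer's deadline passes.

```javascript
// Resolves to true if the timer callback ran, false if clearTimeout stopped it.
function clearAfterDeadline(delayMs, busyMs) {
  return new Promise((resolve) => {
    let fired = false;
    const timer = setTimeout(() => { fired = true; }, delayMs);

    // Keep the thread busy until the timer's deadline has already passed.
    const start = Date.now();
    while (Date.now() - start < busyMs) {}

    // The timer is overdue at this point, but its callback has not run yet.
    clearTimeout(timer);

    // Give the event loop time to run the callback if it were still pending.
    setTimeout(() => resolve(fired), delayMs + 20);
  });
}

clearAfterDeadline(20, 60).then((fired) => {
  console.log(fired ? "callback ran" : "callback was skipped");
});
```

With busyMs larger than delayMs, the cleared callback is never invoked, which matches the behaviour described above.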

Node.js executes in a single thread.
So there cannot be any race conditions and you can reliably cancel the timeout before it triggers.
See also a related discussion (in browsers).
I am talking about a situation where the setTimeout time actually expires and the interpreter starts the process of executing it
Without having looked at Node.js internals, I don't think this is possible. Everything is single-threaded, so the interpreter cannot be "in the process" of doing anything while your code is running.
Your code has to return control before the timeout can be triggered. If you put an infinite loop in your code, the whole system hangs. This is all "cooperative multitasking".

This behavior is defined in the HTML Standard. The task that runs when the timer fires starts with:
If the entry for handle in the list of active timers has been cleared, then abort these steps.
Therefore even if the task has been queued already, it'll be aborted.
Whether this applies to Node.js, however, is debatable, as the documentation just states:
The timer functions within Node.js implement a similar API as the timers API provided by Web Browsers but use a different internal implementation that is built around the Node.js Event Loop.
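The "abort if the handle has been cleared" step can be illustrated with a toy model. This is a sketch only: `ToyTimers` and its methods are invented for illustration and are neither the browser's nor Node's real implementation. The `expire` method simulates the moment the delay elapses and the task is queued without being run yet.

```javascript
// Hypothetical toy model of a list of active timers plus a task queue.
class ToyTimers {
  constructor() {
    this.nextHandle = 1;
    this.active = new Map(); // handle -> callback (the "list of active timers")
    this.taskQueue = [];     // tasks that are queued but have not run yet
  }

  setTimeout(callback) {
    const handle = this.nextHandle++;
    this.active.set(handle, callback);
    return handle;
  }

  // Simulates the delay elapsing: a task is queued, not executed.
  expire(handle) {
    this.taskQueue.push(() => {
      const callback = this.active.get(handle);
      if (callback === undefined) return; // entry was cleared: abort these steps
      this.active.delete(handle);
      callback();
    });
  }

  clearTimeout(handle) {
    this.active.delete(handle);
  }

  runQueuedTasks() {
    while (this.taskQueue.length > 0) this.taskQueue.shift()();
  }
}

const timers = new ToyTimers();
const log = [];
const first = timers.setTimeout(() => log.push("first"));
const second = timers.setTimeout(() => log.push("second"));
timers.expire(first);       // both tasks are already in the queue...
timers.expire(second);
timers.clearTimeout(first); // ...when the first handle is cleared
timers.runQueuedTasks();
console.log(log);           // [ 'second' ]
```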

Related

How can a script and setTimeout/setInterval work together in NodeJS?

Reading through the NodeJS Event Loop description I wonder how setTimeout and setInterval can actually work.
The page says NodeJS first runs the given script (let REPL alone for now) and then enters the event loop. But what if I call setTimeout in that script and expect it to trigger while the script is still running? Isn't that the normal case actually? According to the description the timer callback will not be triggered before the main script ends, which sounds really weird to me.
For those interested, here's the NodeJS outer event loop (there are actually 2 nested loops): https://github.com/nodejs/node/blob/master/src/node.cc#L4526
Let's do this by example:
setTimeout(function(){
  print('there');
});
print('hi');
This will print hi, then there.
Here's what happens: the script is processed until the last line, and whenever it finds a timer function, it adds it to a queue that is handled later, at the end of the execution, by the queue scheduler.
loop queue => [ setTimeout ]
Before exit there has to be a scheduler, some kind of loop that checks whether we have something in the queue and handles it. Once the queue has run out of timers, the loop exits.
Let's suppose we call setTimeout inside setInterval:
setInterval(function(){
  setTimeout(function(){
    print('hi')
  }, 500);
}, 1000);
loop queue => [ setInterval ]
After 1000 ms, setInterval fires and the inner setTimeout is added to the queue:
loop queue => [ setTimeout, setInterval ]
Now we get back to the main loop, which waits another 500 ms and fires the inner setTimeout function, then removes it from the queue, because a setTimeout should run only once.
loop queue => [ setInterval ]
Back in the main loop, we still have items in the queue, so it waits another 500 ms and fires the interval again (500 + 500 = 1000 ms). The inner setTimeout function is added to the queue again:
loop queue => [ setTimeout, setInterval ]
Back to the main loop, again and again...
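The walkthrough above can be replayed with a toy scheduler that uses a virtual clock instead of real time. This is a sketch under stated assumptions: `createVirtualLoop` is a made-up helper, and sorting a plain array is a simplification that is not how libuv actually stores timers.

```javascript
// Hypothetical virtual-clock scheduler, for illustration only.
function createVirtualLoop() {
  let now = 0;
  let seq = 0;      // tie-breaker so equal deadlines keep insertion order
  const queue = []; // entries: { at, seq, fn, every }
  const add = (fn, delay, every) => {
    queue.push({ at: now + delay, seq: seq++, fn, every });
  };
  return {
    now: () => now,
    setTimeout: (fn, delay) => add(fn, delay, null),
    setInterval: (fn, delay) => add(fn, delay, delay),
    // Fire every timer whose deadline is at or before virtual time `until`.
    run(until) {
      while (queue.length > 0) {
        queue.sort((a, b) => a.at - b.at || a.seq - b.seq);
        if (queue[0].at > until) break;
        const timer = queue.shift();
        now = timer.at;
        timer.fn();
        // An interval goes back into the queue; a timeout does not.
        if (timer.every !== null) add(timer.fn, timer.every, timer.every);
      }
      now = until;
    },
  };
}

const loop = createVirtualLoop();
const fired = [];
loop.setInterval(() => {
  fired.push(`interval@${loop.now()}`);
  loop.setTimeout(() => fired.push(`timeout@${loop.now()}`), 500);
}, 1000);
loop.run(3000);
console.log(fired);
// [ 'interval@1000', 'timeout@1500', 'interval@2000', 'timeout@2500', 'interval@3000' ]
```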
Now this is simply how timers work. They are not meant to handle blocking code; they are a way to run code at some interval.
setInterval(function(){
  // do something long running here
  while (1) {}
  setTimeout(function(){
    print('hi')
  }, 500);
}, 1000);
The main loop will block here and the inner timeout will never be added to the queue, so this is a bad idea.
Node.js, and event loops in general, are good with network operations because they don't block when used with select, for example.
setInterval(function(){
  // check if socket has something
  if (socketHasData( socket )){
    processSocketData( socket );
  }
  // do something else that does not block
  // maybe schedule another timer here
  print('hello');
}, 1000);
libuv, which is the event loop used in Node.js, uses threads to handle some blocking operations such as file I/O (open/read/write).
[EDIT] Hmm, re-reading your initial post, I think I know what bugs you. You mentioned Node.js in your post, implying you might be coding a server.
If you are not really familiar with server-side JavaScript and are more used to a PHP server, for example, it might be very confusing indeed.
With a PHP server, a request creates a new thread that will handle it, and when the main script (as you call it) ends, the thread is killed and nothing else runs on the server (except for the web server that listens for requests, like nginx or Apache).
With Node.js, it's different. The main thread is alone and always running. So when a request arrives, callbacks are fired but they are still in that single thread. Said otherwise: the main script never ends (except when you kill it or your server crashes :) )
Well, that is accurate. Because of the single-threaded nature of JavaScript, if a timer ends while the main thread is busy, the timer's callback will wait.
When you do
setTimeout(callback, 1000)
You are not saying "I want this callback to be called in exactly 1s" but actually "I want this callback to be called in, at least, 1s".
This article by John Resig is an excellent read and goes through the details of JavaScript timers: https://johnresig.com/blog/how-javascript-timers-work/
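The "at least" part is easy to observe. Here is a minimal sketch, where `measureTimerDelay` is a hypothetical helper that requests a short timeout and then keeps the thread busy for longer than that:

```javascript
// Resolves with the number of milliseconds that actually elapsed
// before the timer callback was allowed to run.
function measureTimerDelay(requestedMs, busyMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), requestedMs);

    // Synchronous work that outlasts the requested delay.
    while (Date.now() - start < busyMs) {}
  });
}

measureTimerDelay(10, 100).then((elapsed) => {
  console.log(`asked for 10 ms, callback ran after ${elapsed} ms`);
});
```

The reported time is never below busyMs, because the callback cannot start until the synchronous loop has returned.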
But what if I call setTimeout in that script and expect it to trigger while the script is still running?
You don't expect that. You expect your synchronous code to run to completion way before the timeout occurs.
If the script is still running, because it's doing something blocking - it hangs - then the timeout callback doesn't get a chance to execute, it will wait. That's exactly why we need to write non-blocking code.
Isn't that the normal case actually?
No. Most of the time no JS is executing, the event loop is idling (while there might be background tasks doing the heavy lifting).
Given that Node is single-threaded, it (the V8 engine) always runs the current piece of JavaScript to completion before moving on to the next one. So when we start a Node server with a main script, it loads, parses, compiles and executes that script first, before it runs anything else. Only when the currently running code returns (for example, after it has started an asynchronous I/O call whose callback is queued for later) do other callbacks, including setTimeout callbacks, get a chance to execute. This is the very nature of the JavaScript engine and the reason Node is not considered good for long-running, in-memory, CPU-intensive tasks.
As @atomrc said in his answer, setTimeout and setInterval are just a hint to Node to run the callbacks after the timeout period; there are no guarantees.

setTimeOut() and setInterval() in Node [duplicate]

If I call setTimeout() for, say, 10 seconds from now, then execute a set of long-running commands, does Node.js/JavaScript wait until those commands finish before executing the function set up in setTimeout()? Is the same true with setInterval()?
Are there any things to watch out for if I'm using both setTimeout() and setInterval() in code where tasks may end up being executed around the same time?
I'm using the node-cron (https://github.com/kelektiv/node-cron/blob/master/lib/cron.js) library and I see that it uses setTimeout. I'm trying to add some tasks using setInterval().
Timer events in node.js are not guaranteed to be called at an accurate time.
If I call setTimeout() for, say, 10 seconds from now, then execute a set of long-running commands, does Node.js/JavaScript wait until those commands finish before executing the function set up in setTimeout()?
Yes, it waits until the current code executing in node.js is done before it can serve the next timer event.
Is the same true with setInterval()?
Yes, same mechanism for setInterval().
Here's some explanation of how the node.js system works.
node.js is a single threaded event-driven system (technically threads are used inside of node.js, but it only runs one single thread of your JS code).
When you use setTimeout() or setInterval() some internal mechanism inside of node.js uses system timers to know when the next timer should fire. At that moment, an event is inserted into the node.js event queue. If node.js is doing nothing at that moment, then the event is triggered immediately and the appropriate callback function is called immediately.
But, if node.js is busy running code and if other events are in front of the timer event in the event queue, then the timer event will not be triggered immediately.
Instead, node.js will wait until the current thread of execution in node.js is done and then, and only then, the next event in the event queue will be pulled out and the appropriate callback for that event will get called.
So, if you have some piece of long running node.js code, it will block all other events (including timer events) until it is done and node.js can get back to pulling the next event out of the event queue.
The answer is: maybe but probably not.
When things are async, there isn't necessarily any guarantee about what gets called when. When an asynchronous operation is started, its callback is added to an internal queue, and the queue is processed in turn. Odds are the earlier work will have finished, but it is not something you should rely on.
Instead, what you should do is trigger something to explicitly indicate that it finished. There are lots of ways to go about it, such as callbacks and Promises. You could even set a boolean that indicates the state and check it before the dependent step.
let aDone = false;
setTimeout(() => { aDone = true; }, 1000);
const startInterval = () => {
  if (!aDone) {
    setTimeout(startInterval, 200); // try again in 200ms
    return; // not ready yet, so don't start the interval on this attempt
  }
  setInterval(() => { /* do something */ }, 1000);
};
startInterval(); // kick off the interval check
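The Promise route mentioned above could look something like the following. It is a minimal sketch: `delay` and `runInOrder` are hypothetical names, and the 50 ms delay stands in for whatever asynchronous step has to finish first.

```javascript
// Hypothetical helper: a promise that resolves after `ms` milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// The dependent step starts only after step A has explicitly signalled completion.
async function runInOrder(events) {
  await delay(50); // step A (stand-in for real asynchronous work)
  events.push("A finished");
  events.push("dependent step started");
  return events;
}

runInOrder([]).then((events) => console.log(events));
```

Compared with polling a boolean, the ordering here is expressed directly by `await`, so there is no retry timer to tune.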

Browser Javascript: setTimeout and the Main Program

When running in a browser, will setTimeout ever fire its code before the main program is done executing? Have the major browser vendors agreed on this behavior, or is it a side-effect of implementation? (or have they agreed to keep this side-effect in as standard behavior)
Consider a very simple (and useless) program.
setTimeout(function(){
  console.log("Timeout Called");
}, 1);
for (var i = 0; i < 10000000; i++) {}
console.log("done");
First we set a setTimeout callback function with a one-millisecond delay, which outputs Timeout Called to the console.
Then we spin in a loop for more than a millisecond.
Then we output done to the console.
When I run this program, it always outputs
done
Timeout Called
That is, the setTimeout callback functions aren't considered until the main program has run.
Is this reliable, defined behavior? Or are there times when the main program execution will be halted, the callback run, and then main program execution continued?
Yes, it is defined behaviour. It is a common misconception that Ajax callbacks execute nondeterministically at some time, possibly before the current execution path finishes, when in reality they will always execute some time afterwards.
JavaScript is single-threaded and will never return to the event loop until the current thread finishes executing completely.
An asynchronous function, such as an event handler (includes Ajax) or a function that is scheduled with setInterval/setTimeout will never execute before the current execution path completes.
This is very well defined behavior.
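The run-to-completion rule can be checked by recording the order of events instead of eyeballing the console. This is a small sketch; `runMainThenTimeout` is a hypothetical wrapper around the program from the question.

```javascript
// Resolves with the order in which the two messages were produced.
function runMainThenTimeout() {
  return new Promise((resolve) => {
    const output = [];
    setTimeout(() => {
      output.push("Timeout Called");
      resolve(output);
    }, 1);

    // The "main program": spins for longer than the 1 ms timeout.
    for (let i = 0; i < 10000000; i++) {}
    output.push("done");
  });
}

runMainThenTimeout().then((output) => console.log(output.join("\n")));
```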
The browser is not async and still has to wait for a previous action to complete before it does the next action
When using setTimeout, it will first wait for the number of milliseconds that you passed, and then it will continue to wait until there is an opening in the code. Generally, this means it will wait until the code is done. The only exception (sort of) is when using other setTimeouts or setIntervals. For example, if your loop had been
for (var i = 0; i < 10000000; i++) {
  setTimeout(function () {
    console.log('One iteration');
  }, 15);
}
Your output would be
done
Timeout Called
One iteration
One iteration
And so on.

Can JavaScript's setInterval block thread execution?

Can setInterval result in other scripts in the page blocking?
I'm working on a project to convert a Gmail related bookmarklet into a Google Chrome extension. The bookmarklet uses the gmail greasemonkey API to interact with the gmail page. The JavaScript object for the API is one of the last parts of the Gmail page to load and is loaded into the page via XMLHttpRequest. Since I need access to this object, and global JavaScript variables are hidden from extension content scripts, I inject a script into the gmail page that polls for the variable's definition and then accesses it. I'm doing the polling using the setInterval function. This works about 80% of the time. The rest of the time the polling function keeps polling until reaching a limit I set and the greasemonkey API object is never defined in the page.
Injected script sample:
var attemptCount = 0;
var attemptLoad = function(callback) {
  if (typeof(gmonkey) != "undefined") {
    clearInterval(intervalId); // unregister this function for interval execution.
    gmonkey.load('1.0', function (gmail) {
      self.gmail = gmail;
      if (callback) { callback(); }
    });
  }
  else {
    attemptCount++;
    console.log("Gmonkey not yet loaded: " + attemptCount);
    if (attemptCount > 30) {
      console.log("Could not find Gmonkey in the gmail page after thirty seconds. Aborting");
      clearInterval(intervalId); // unregister this function for interval execution.
    }
  }
};
var intervalId = setInterval(function(){attemptLoad(callback);}, 1000);
JavaScript is single-threaded (except for web workers, which we aren't talking about here). That means that as long as the regular JavaScript thread of execution is running, your setInterval() timer will not run until the regular JavaScript thread of execution is done.
Likewise, if your setInterval() handler is executing, no other JavaScript event handlers will fire until your setInterval() handler finishes executing its current invocation.
So, as long as your setInterval() handler doesn't get stuck and run forever, it won't block other things from eventually running. It might delay them slightly, but they will still run as soon as the current setInterval() thread finishes.
Internally, the javascript engine uses a queue. When something wants to run (like an event handler or a setInterval() callback) and something is already running, it inserts an event into the queue. When the current javascript thread finishes execution, the JS engine checks the event queue and if there's something there, it picks the oldest event there and calls its event handler.
Here are a few other references on how the Javascript event system works:
How does JavaScript handle AJAX responses in the background?
Are calls to Javascript methods thread-safe or synchronized?
Do I need to be concerned with race conditions with asynchronous Javascript?
setInterval and setTimeout are "polite", in that they don't fire when you think they would -- they fire any time the thread is clear, after the point you specify.
As such, the act of scheduling something won't stop something else from running -- it just sets itself to run at the end of the current queue, or at the end of the specified time (whichever is longer).
Two important caveats:
The first would be that setTimeout/setInterval have browser-specific minimums. Frequently, they're around 15ms. So if you request something every 1ms, the browser will actually schedule them to be every browser_min_ms (not a real variable) apart.
The second is that with setInterval, if the script in the callback takes LONGER than the interval, you can run into a trainwreck where the browser will keep queuing up a backlog of intervals.
function doExpensiveOperation () {
  var i = 0, l = 20000000;
  for (; i < l; i++) {
    doDOMStuffTheWrongWay(i);
  }
}
setInterval(doExpensiveOperation, 10);
BadTimes+=1;
But for your code specifically, there's nothing inherently wrong with what you're doing.
Like I said, setInterval won't abort anything else from happening, it'll just inject itself into the next available slot.
I would probably recommend that you use setTimeout, for general-purpose stuff, but you're still doing a good job of keeping tabs of the interval and keeping it spaced out.
There may be something else going on, elsewhere in the code -- either in Google's delivery, or in your collection.
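For reference, the setTimeout-based alternative to setInterval polling could be sketched like this. `pollUntil` is a hypothetical helper, not part of any library; each attempt schedules the next one only after it has finished, so attempts can never pile up.

```javascript
// Calls check() repeatedly, intervalMs apart, until it returns true
// or maxAttempts is reached. Resolves with the number of attempts used.
function pollUntil(check, intervalMs, maxAttempts) {
  return new Promise((resolve, reject) => {
    let attempts = 0;
    function attempt() {
      attempts += 1;
      if (check()) {
        resolve(attempts);
        return;
      }
      if (attempts >= maxAttempts) {
        reject(new Error(`gave up after ${attempts} attempts`));
        return;
      }
      setTimeout(attempt, intervalMs); // schedule the next try only now
    }
    attempt();
  });
}

// Usage: the condition here is a stand-in for "the global has been defined".
let calls = 0;
pollUntil(() => ++calls >= 3, 5, 10).then((attempts) => {
  console.log(`ready after ${attempts} attempts`);
});
```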

Is there a general mechanism to timeout events in node.js?

I am learning node.js and most of the examples I can find deal with simple cases. I am more interested in building complicated real-world systems and estimating how well the event-based model of node.js can handle all the use cases of a real application.
One of the common patterns that I want to apply is to let a blocking operation time out if it does not complete within a certain time. For example, if it takes more than 30 seconds to execute a database query, that might be too much for a certain application. Or if it takes more than 10 seconds to read a file.
For me the ideal program flow with timeouts would be similar to the program flow with exceptions. If an event does not occur within certain predefined timeout limit, then the event listener would be cleared from the event loop and a timeout event would be generated instead. This timeout event would have an alternate listener. If the event is handled normally, then both the timeout listener and event listener are cleared from the event loop.
Is there a general mechanism for timeout handling and cleaning up timed out processes? I know some types such as socket have timeout parameter but it is not a general mechanism that applies to all events.
There is nothing like this at the moment (that I know of, but I don't know everything).
The only thing I can think of is that you reset it yourself somehow. I've given an example below but I think it may have some scope issues. Should be solvable though.
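var to;
function cb() {
  clearTimeout(to); // the operation finished in time, so cancel the timeout
  // do stuff
}
function cbcb() {
  cb(); // looks up cb at call time, so it sees the replacement made by cancel()
}
function cancel() {
  cb = function() {}; // notice empty: a late result is now ignored
}
// doSomethingAsync stands in for any asynchronous fs call; it is not a real fs method
fs.doSomethingAsync(file, cbcb);
to = setTimeout(cancel, 10000);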
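With Promises, the same idea can be written as a reusable wrapper. This is a sketch, not a built-in Node.js mechanism: `withTimeout` and `slowOperation` are hypothetical names. Note that it only stops waiting for the result; the underlying operation is not cancelled.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical slow operation that takes 200 ms to produce a result.
const slowOperation = () =>
  new Promise((resolve) => setTimeout(() => resolve("result"), 200));

withTimeout(slowOperation(), 50).then(
  (value) => console.log("finished:", value),
  (err) => console.log("failed:", err.message)
);
```

The normal result handler and the timeout handler play the roles of the "event listener" and "alternate listener" from the question, and clearing the timer in `finally` keeps a finished operation from leaving a stray timeout behind.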
