Does setTimeout function affect the performance of Node.js application?

Does setTimeout function affect the performance of Node.js application? - javascript

I have a request handler for ticket booking:
route.post('/makeBooking', (req, res) => {
// Booking code
setTimeout(function () {
// Checks if the payment is made, if not then cancels the booking
}, 900000);
});
Now I've a route which makes a booking and if the payment is not made within 15 minutes the timeout function will cancel the booking.
Will this function cause any performance related issues or memory leaks?

Will this function cause any performance related issues ...
No it won't, at least not in and by itself. While setTimeout is waiting to invoke it's callback, it's non-blocking. The call is simply added to a queue. At some point in the future the callback fires and the call is removed from that queue.
In the meantime, you can still process stuff.
... or memory leaks?
The setTimeout callback is within a closure. As soon as setTimeout invokes the callback, it becomes eligible for garbage collection.
Unless you get many millions of bookings within the 900000ms timeframe, you have nothing to worry about; the number of course depends on the memory
size you allocated to your Node.js application.
Of course if you do get that many requests-per-second, you have other, more important stuff to worry about.

It won't have performance issues or memory leak problems, but using a 15 minutes timeout function might be problematic for debugging and maintaining.
Especially something like cancelling a booking should be solved another way.
You always should write your node application in a way that:
You can run it in cluster mode (Even if you don't need it for performance reasons)
That a crash of the application and immediate restart would not cause much information loss (a 15 minute timeout could be problematic here)
That a crash or restart of the application will not result in an inconsistent state (e.g. a pending booking that would not be cancelled anymore)
So assuming that you already use a database for the booking process, then you should also do the 15 minute timing within the database.

Related

What would happen if a variable were manipulated more than once at the exact same time? Is it possible? [duplicate]

Lets assume I run this piece of code.
var score = 0;
for (var i = 0; i < arbitrary_length; i++) {
async_task(i, function() { score++; }); // increment callback function
}
In theory I understand that this presents a data race and two threads trying to increment at the same time may result in a single increment, however, nodejs(and javascript) are known to be single threaded. Am I guaranteed that the final value of score will be equal to arbitrary_length?

Am I guaranteed that the final value of score will be equal to
arbitrary_length?
Yes, as long as all async_task() calls call the callback once and only once, you are guaranteed that the final value of score will be equal to arbitrary_length.
It is the single-threaded nature of Javascript that guarantees that there are never two pieces of Javascript running at the exact same time. Instead, because of the event driven nature of Javascript in both browsers and node.js, one piece of JS runs to completion, then the next event is pulled from the event queue and that triggers a callback which will also run to completion.
There is no such thing as interrupt driven Javascript (where some callback might interrupt some other piece of Javascript that is currently running). Everything is serialized through the event queue. This is an enormous simplification and prevents a lot of stickly situations that would otherwise be a lot of work to program safely when you have either multiple threads running concurrently or interrupt driven code.
There still are some concurrency issues to be concerned about, but they have more to do with shared state that multiple asynchronous callbacks can all access. While only one will ever be accessing it at any given time, it is still possible that a piece of code that contains several asynchronous operations could leave some state in an "in between" state while it was in the middle of several async operations at a point where some other async operation could run and could attempt to access that data.
You can read more about the event driven nature of Javascript here: How does JavaScript handle AJAX responses in the background? and that answer also contains a number of other references.
And another similar answer that discusses the kind of shared data race conditions that are possible: Can this code cause a race condition in socket io?
Some other references:
how do I prevent event handlers to handle multiple events at once in javascript?
Do I need to be concerned with race conditions with asynchronous Javascript?
JavaScript - When exactly does the call stack become "empty"?
Node.js server with multiple concurrent requests, how does it work?
To give you an idea of the concurrency issues that can happen in Javascript (even without threads and without interrupts, here's an example from my own code.
I have a Raspberry Pi node.js server that controls the attic fans in my house. Every 10 seconds it checks two temperature probes, one inside the attic and one outside the house and decides how it should control the fans (via relays). It also records temperature data that can be presented in charts. Once an hour, it saves the latest temperature data that was collected in memory to some files for persistence in case of power outage or server crash. That saving operation involves a series of async file writes. Each one of those async writes yields control back to the system and then continues when the async callback is called signaling completion. Because this is a low memory system and the data can potentially occupy a significant portion of the available RAM, the data is not copied in memory before writing (that's simply not practical). So, I'm writing the live in-memory data to disk.
At any time during any of these async file I/O operations, while waiting for a callback to signify completion of the many file writes involved, one of my timers in the server could fire, I'd collect a new set of temperature data and that would attempt to modify the in-memory data set that I'm in the middle of writing. That's a concurrency issue waiting to happen. If it changes the data while I've written part of it and am waiting for that write to finish before writing the rest, then the data that gets written can easily end up corrupted because I will have written out one part of the data, the data will have gotten modified from underneath me and then I will attempt to write out more data without realizing it's been changed. That's a concurrency issue.
I actually have a console.log() statement that explicitly logs when this concurrency issue occurs on my server (and is handled safely by my code). It happens once every few days on my server. I know it's there and it's real.
There are many ways to work around those types of concurrency issues. The simplest would have been to just make a copy in memory of all the data and then write out the copy. Because there are not threads or interrupts, making a copy in memory would be safe from concurrency (there would be no yielding to async operations in the middle of the copy to create a concurrency issue). But, that wasn't practical in this case. So, I implemented a queue. Whenever I start writing, I set a flag on the object that manages the data. Then, anytime the system wants to add or modify data in the stored data while that flag is set, those changes just go into a queue. The actual data is not touched while that flag is set. When the data has been safely written to disk, the flag is reset and the queued items are processed. Any concurrency issue was safely avoided.
So, this is an example of concurrency issues that you do have to be concerned about. One great simplifying assumption with Javascript is that a piece of Javascript will run to completion without any thread of getting interrupted as long as it doesn't purposely return control back to the system. That makes handling concurrency issues like described above lots, lots easier because your code will never be interrupted except when you consciously yield control back to the system. This is why we don't need mutexes and semaphores and other things like that in our own Javascript. We can use simple flags (just a regular Javascript variable) like I described above if needed.
In any entirely synchronous piece of Javascript, you will never be interrupted by other Javascript. A synchronous piece of Javascript will run to completion before the next event in the event queue is processed. This is what is meant by Javascript being an "event-driven" language. As an example of this, if you had this code:
console.log("A");
// schedule timer for 500 ms from now
setTimeout(function() {
console.log("B");
}, 500);
console.log("C");
// spin for 1000ms
var start = Date.now();
while(Data.now() - start < 1000) {}
console.log("D");
You would get the following in the console:
A
C
D
B
The timer event cannot be processed until the current piece of Javascript runs to completion, even though it was likely added to the event queue sooner than that. The way the JS interpreter works is that it runs the current JS until it returns control back to the system and then (and only then), it fetches the next event from the event queue and calls the callback associated with that event.
Here's the sequence of events under the covers.
This JS starts running.
console.log("A") is output.
A timer event is schedule for 500ms from now. The timer subsystem uses native code.
console.log("C") is output.
The code enters the spin loop.
At some point in time part-way through the spin loop the previously set timer is ready to fire. It is up to the interpreter implementation to decide exactly how this works, but the end result is that a timer event is inserted into the Javascript event queue.
The spin loop finishes.
console.log("D") is output.
This piece of Javascript finishes and returns control back to the system.
The Javascript interpreter sees that the current piece of Javascript is done so it checks the event queue to see if there are any pending events waiting to run. It finds the timer event and a callback associated with that event and calls that callback (starting a new block of JS execution). That code starts running and console.log("B") is output.
That setTimeout() callback finishes execution and the interpreter again checks the event queue to see if there are any other events that are ready to run.

Node uses an event loop. You can think of this as a queue. So we can assume, that your for loop puts the function() { score++; } callback arbitrary_length times on this queue. After that the js engine runs these one by one and increase score each time. So yes. The only exception if a callback is not called or the score variable is accessed from somewhere else.
Actually you can use this pattern to do tasks parallel, collect the results and call a single callback when every task is done.
var results = [];
for (var i = 0; i < arbitrary_length; i++) {
async_task(i, function(result) {
results.push(result);
if (results.length == arbitrary_length)
tasksDone(results);
});
}

No two invocations of the function can happen at the same time (b/c node is single threaded) so that will not be a problem. The only problem would be ifin some cases async_task(..) drops the callback. But if, e.g., 'async_task(..)' was just calling setTimeout(..) with the given function, then yes, each call will execute, they will never collide with each other, and 'score' will have the value expected, 'arbitrary_length', at the end.
Of course, the 'arbitrary_length' can't be so great as to exhaust memory, or overflow whatever collection is holding these callbacks. There is no threading issue however.

I do think it’s worth noting for others that view this, you have a common mistake in your code. For the variable i you either need to use let or reassign to another variable before passing it into the async_task(). The current implementation will result in each function getting the last value of i.

IndexedDb transaction auto-commit behavior in edge cases

Tx is committed when :
request success callback returns
- that means that multiple requests can be executed within transaction boundaries only when next request is executed from success callback of the previous one
when your task returns to event loop
It means that if no requests are submitted to it, it is not committed until it returns to event loop. These facts pose 2 problematic states :
placing a new IDB request by enqueuing a new task to event loop queue from within the success callback of previous request instead of submitting new request synchronously
in that case the first success callback immediately returns but another IDB request has been scheduled
are all the asynchronous requests executed within the single initial transaction? This is quite essential in case you want to implement result pulling with back-pressure where consumer gives you a feedback in form of a Future that it is ready to consume another response
creating a ReadWrite tx, not placing any requests against it and creating another one before returning to event loop
does creating a new one implicitly commits the previous tx ? If not, serious write lock starvations might occur, because :
If multiple "readwrite" transactions are attempting to access the same
object store (i.e. if they have overlapping scope), the transaction
that was created first must be the transaction which gets access to
the object store first. Due to the requirements in the previous
paragraph, this also means that it is the only transaction which has
access to the object store until the transaction is finished.
The example of enqueuing a new task to event loop queue from within the success callback by recursive request submission with back-pressure :
function recursiveFn(key) {
val req = store.get(key)
req.onsuccess = function() {
observer.onNext(req.result).onsuccess { recursiveFn(nextKey) }
}
}
Observer#onNext // returns Future[Ack] Ack is either Continue or Cancel
Now can onsuccess or onNext do a setTimeout(0) or not to make the whole thing be part of one transaction?
Bonus question :
I think that ReadOnly transactions are exposed to the consumer/user just because it would be hard to detect the end of a batch read if you recursively submit new requests from the success callback of the previous one right? Otherwise I don't see any other reason for them to be exposed to a user, right ?

I'm not sure I understand your question completely but I'll offer an answer on whether you can safely use IDB transaction events to move a state machine.
Yes and no. Yes in theory, no in practice.
I think you understand the transaction lifetime. But to rehash:
The lifetime of a transactions lasts as long as it's referenced: it's "active" so long as it's being referenced, after which it is said to be "finished" and the transaction is committed.
In theory, oncomplete should fire whenever a transaction successfully commits. There's a useful tip in the spec on this that suggests how you could loop:
To determine if a transaction has completed successfully, listen to the transaction’s complete event rather than the IDBObjectStore.add request’s success event, because the transaction may still fail after the success event fires.
To safely use this mechanism be sure to watch for non-success events including onblocked and onabort as well.
Practically speaking, I've found transactions to be flakey when long-lived or done consecutively in batches (as you've noted in another IDB comment). I'm generally not engineering my apps to require tricky behavior because, no matter how the spec says it should behavior, I'm seeing wonky transactions in both Firefox and Chromium (but mostly Blink, interestingly) especially when multiple tabs are open.
I spent many weeks rewriting dash to reuse transactions for supposed performance gains. In the end it could not pass even my basic write tests and I was forced to abandon simultaneous/queued/consecutive transactions and rewrite once again. This time I picked a one-transaction-at-a-time model which is slower but, for me, more reliable (and suggest to avoid my lib and use something like ydn for bulk inserts).
I'm not sure on your application requirements, but in my humble opinion tying in your I/O into your event loop seems like a disastrous idea. If I needed an event loop as what I understand to be the term I would definitely use requestAnimationFrame() and throttle that callback if I needed fewer ticks than one per ~33 milliseconds.

Javascript internals - clearTimeout just before it fires

Let's say I do this:
var timer = setTimeout(function() {
console.log("will this happen?");
}, 5000);
And then after just less than 5 seconds, another callback (from a network event in NodeJS for example) fires and clears it:
clearTimeout(timer);
Is there any possibility that the callback from the setTimeout call is already in the queue to be executed at this point, and if so will the clearTimeout be in time to stop it?
To clarify, I am talking about a situation where the setTimeout time actually expires and the interpreter starts the process of executing it, but the other callback is currently running so the message is added to the queue. It seems like one of those race condition type things that would be easy to not account for.

Even though Node is single thread, the race condition the question describes is possible.
It can happen because timers are triggered by native code (in lib_uv).
On top of that, Node groups timers with the same timeout value. As a result, if you schedule two timers with the same timeout within the same ms, they will be added to the event queue at once.
But rest assured node internally solves that for you. Quoting code from node 0.12.0:
timer.js > clearTimeout
exports.clearTimeout = function(timer) {
if (timer && (timer[kOnTimeout] || timer._onTimeout)) {
timer[kOnTimeout] = timer._onTimeout = null;
// ...
}
}
On clearing a timeout, Node internally removes the reference to the callback function. So even if the race condition happens, it can do no harm, because those timers will be skipped:
listOnTimeout
if (!first._onTimeout) continue;

Node.js executes in a single thread.
So there cannot be any race conditions and you can reliably cancel the timeout before it triggers.
See also a related discussion (in browsers).
I am talking about a situation where the setTimeout time actually expires and the interpreter starts the process of executing it
Without having looked at Node.js internals, I don't think this is possible. Everything is single-threaded, so the interpreter cannot be "in the process" of doing anything while your code is running.
Your code has to return control before the timeout can be triggered. If you put an infinite loop in your code, the whole system hangs. This is all "cooperative multitasking".

This behavior is defined in the HTML Standard, the fired task starts with:
If the entry for handle in the list of active timers has been cleared, then abort these steps.
Therefore even if the task has been queued already, it'll be aborted.
Whether this applies to Node.js, however, is debatable, as the documentation just states:
The timer functions within Node.js implement a similar API as the timers API provided by Web Browsers but use a different internal implementation that is built around the Node.js Event Loop.

Batch Backbone.js events?

My app's framework is built around collapsing backbone models sending the data via websockets and updating models on other clients with the data. My question is how should I batch these updates for times when an action triggers 5 changes in a row.
The syncing method is set up to update on any change but if I set 5 items at the same time I don't want it to fire 5 times in a row.
I was thinking I could do a setTimeout on any sync that gets cleared if something else tries to sync within a second of it. Does this seem like the best route or is there a better way to do this?
Thanks!

i haven't done this with backbone specifically, but i've done this kind of batching of commands in other distributed (client / server) apps in the past.
the gist of it is that you should start with a timeout and add a batch size for further optimization, if you see the need.
say you have a batch size of 10. what happens when you get 9 items stuffed into the batch and then the user just sits there and doesn't do anything else? the server would never get notified of the things the user wanted to do.
timeout generally works well to get small batches. but if you have an action that generates a large number of related commands you may want to batch all of the commands and send them all across as soon as they are ready instead of waiting for a timer. the time may fire in the middle of creating the commands and split things apart in a manner that causes problems, etc.
hope that helps.

Underscore.js, the utility library that Backbone.js uses, has several functions for throttling callbacks:
throttle makes a version of a function that will execute at most once every X milliseconds.
debounce makes a version of a function that will only execute if X milliseconds elapse since the last time it was called
after makes a version of a function that will execute only after it has been called X times.
So if you know there are 5 items that will be changed, you could register a callback like this:
// only call callback after 5 change events
collection.on("change", _.after(5, callback));
But more likely you don't, and you'll want to go with a timeout approach:
// only call callback 30 milliseconds after the last change event
collection.on("change", _.debounce(30, callback));

Is there a general mechanism to timeout events in node.js?

I am learning node.js and most of examples I can find are dealing with simple examples. I am more interested in building real-world complicated systems and estimating how well event based model of node.js can handle all the use cases of a real application.
One of the common patterns that I want to apply is let blocking execution to time-out if it does not occur within certain timeout time. For example if it takes more than 30 seconds to execute a database query, it might be too much for certain application. Or if it takes more than 10 seconds to read a file.
For me the ideal program flow with timeouts would be similar to the program flow with exceptions. If an event does not occur within certain predefined timeout limit, then the event listener would be cleared from the event loop and a timeout event would be generated instead. This timeout event would have an alternate listener. If the event is handled normally, then both the timeout listener and event listener are cleared from the event loop.
Is there a general mechanism for timeout handling and cleaning up timed out processes? I know some types such as socket have timeout parameter but it is not a general mechanism that applies to all events.

There is nothing like this at the moment (that i know of, but i don't know everything).
The only thing i can think of is that you reset it yourself somehow. I've given an example below but I think it may have some scope issues. Should be solvable though.
var to
function cb() {
clearTimeout(to)
// do stuff
}
function cbcb() {
cb()
}
function cancel() {
cb = function() {} // notice empty
}
fs.doSomethingAsync(file, cbcb)
to = setTimeout(cancel, 10000)

We Keep Coding

JavaScript is the programming language of the Web.