Set UP:
Hi. I am trying to learn about creating/instantiating objects. I was thinking I should be able to create multiple objects that may have different amounts of similar works (like gather news articles) and would "report completion" regardless of the order created. So far I am not clear on this, so here is a basic example followed by the stated question:
function test(count){
this.count = count;
for(var i = 0; i< count; i++){}
console.log(i);
}
new test(1000);
new test(10);
Actual Question:
Based on the code above, I would expect the second instance to print first, but it does not. What would be the correct way to set this up so that which ever object finishes its work would print first?
* Modify my question *
Sorry...what I am really intending to ask is how to set objects to have more of an asynchronous behavior. I am new to Stack, so please let me know if I should close/move this question.
In general JavaScript is uses a synchronous execution model, the event queue. All calls are placed here, basically in the order they appear in your source code (respecting the scope they are in).
So if you start some function, nothing else is executed until that very function is finished. In your case you place both calls on the event queue, but the second call will only be executed, when the first one has finished.
There, however, exceptions: Worker or AJAX requests.
Here the execution is outside the usual event queue and you use message handlers or callbacks to use the results, after the execution is finished. In most cases, however, you can't be sure in which order the calls are finished as there are many circumstances affecting the ordering of execution (network delay, cpu usage, etc.)
In your case, it seems, you want to load external resources, anyway. So have a look at how AJAX works.
You're not creating different parallel execution threads : the first test(1000) is wholly executed before the following line even starts.
In Javascript, when you're not using webworkers (which you should probably not use), all your code is always executed in a single thread. This thread is awaken by the browser on events and goes back to sleep when the function called by the browser returns.
Note that even with threads, even in naturally parallel languages, you hardly would have had any guarantee regarding what loop ends first.
The reason why the second instance does not print first is that the first call new test(1000) runs to completion before the next line new test(10) starts executing.
If you want the second call to run first, swap the two lines...
Edit:
By the way, any decent compiler will remove this line completely:
for(var i = 0; i< count; i++){}
So you can't even expect the first call to take more time than the second...
The second instance gets instantiated only after first object is instantiated.
So it first prints 1000 after it comes to new test(10)
Because it is single threaded program
Related
I know messages come into call stack from the queue when call stack is empty. Wouldn't it be better though, if event loop could push messages from queue directly to call stack without waiting? What reasons are behind this behavior? If the event loop would push a message at an exact time we could always rely on function such as setTimeout etc.
setTimeout(() => console.log("I want to be logged for 10ms, but I will never be :("), 10);
// some blocking operations
for(let i = 0; i < 500000000; i++){
Math.random() * 2 + 2 - 3;
}
console.log("I'll be logged first lol");
It'll probably never be changed due to consistency reason but I'm still curious. Maybe I'm not seeing something, and there is the serious technical reason behind the concept of waiting for an empty stack. Do you have access to some articles about architectural decisions in JS, or maybe you know fundamental examples when this behavior is necessary? There are many articles about how JS works, but I couldn't find anything like "Why event loop works exactly that way". Any help would be greatly appreciated.
V8 developer here. This question seems to be based on a misunderstanding of what "the call stack" is: it is not a data structure that anyone can just push things onto. Instead, it is a term for the current state of things when a bunch of functions have called each other. The only way to "push" another function onto the call stack is when the currently executing function calls it. If the event system inserted random calls at random places into your functions, that would lead to a pretty weird programming model.
You could design a programming environment that's conceptually similar, but rather than pushing anything onto the call stack, what it would do is interrupt and suspend whatever is currently executing, and execute a setTimeout-scheduled function (or event handler, etc) instead, and resume previous execution afterwards. One issue you'd have to solve is: what if this repeats, i.e. what if the scheduled function is interrupted by another scheduled function which is interrupted by another scheduled function, and so on? What if a scheduled function takes forever to finish: when does the previously executing code get to make progress again? Also, while this can be done in a single-threaded world, getting random interruptions is concurrency (which from a consistency point of view is equivalent to parallelism/multi-threading), so you'd need synchronization primitives like locks (essentially, have a way for functions to say "don't interrupt this section" -- which in turn means you can't actually guarantee the accuracy of scheduling requests). Don't underestimate the complexity cost all this would impose on programmers: when writing code, they'd have to keep in mind that anything could get interrupted at any time, and on the flip side that any data that one function might want to process might not be ready yet because another function that produced it hasn't finished running yet.
So in short, JavaScript's event loop system is what it is because the language avoids concurrency, and randomly interrupting functions to execute others is concurrency, even on a single-threaded system.
I got confused when I read that when I set the (same) time in 2 timeSetout methods, the order in which the function will be called cannot always be predicted (there was not reasoning provided).
setTimeout(()=> console.log("first"), 5000);
setTimeout(()=> console.log("second"), 5000);
As I understand, the timesetouts will be added to the event table, which keeps track of the events (in this case the event is the time), then it sends it to the event queue, which stores the order in which the functions should be executed, then event loop checks and sends those functions to the call stack.
Since the first timesetout (logging "first") is being added FIRST to the event table, I would imagine it being the FIRST to be send to the event queue (also considering the event queue is a stak data structure) and the FIRST to be send to the call stack during the event loop.
I couldnt find anything about that online, also not reasons as to why it could be executed in random order and would be happy about an answer!
Also, on a side note: Could someone please explain when race conditions could be possible as if the above is not true, I cannot think of any case since javascript is single threaded.
Thanks!!
The specification states the following (emphasis added):
If method context is a Window object, wait until the Document associated with method context has been fully active for a further timeout milliseconds (not necessarily consecutively). Otherwise, method context is a WorkerGlobalScope object; wait until timeout milliseconds have passed with the worker not suspended (not necessarily consecutively).
Wait until any invocations of this algorithm that had the same method context, that started before this one, and whose timeout is equal to or less than this one’s, have completed.
As far as I understand this should ensure the execution to be in order, and you will always see first and then second.
Here is a picture of my web execution captured by Chrome Performance Devtools:
I notice that functions will be stopped many times during execution, when my web functions are stopped Chrome executes some RegExp operations (as shown in the picture). I don't understand what this is, and why it happens. Please help explain, thanks.
Update: here is a function which is also executed in the same manner:
What you describe
The way you describe the problem it sounds like you think the JavaScript virtual machine is putting the functions on hold (stopping them) while they are executing (i.e. before they return) to do something else and then resumes the functions.
The image you are showing does not suggest that at all to me.
What I'm seeing
The VM executes:
callback, which calls
some function whose name is hidden by a tooltip, which calls:
fireWith, which calls:
fire, which calls:
etc...
Then the deepest function returns, and the one that called it returns, and so on and so forth until fire returns, fireWith returns, the function whose name we cannot read returns, and callback returns.
Then the VM runs a RegExp function, and it calls a function named callback again, and the whole thing starts anew. In other words, the second column with callback and the rest is a new invocation of the functions. The functions are not "stop[ping] for a little time": they are called multiple times.
I see this all the time in libraries that respond to events. They typically will loop over event handlers to call them. Given a complex enough library, there'a fair amount of glue code that may sit between the loop that calls the handlers and your custom code so there's a lot of repetitive calls that show up in profiling.
Lets assume I run this piece of code.
var score = 0;
for (var i = 0; i < arbitrary_length; i++) {
async_task(i, function() { score++; }); // increment callback function
}
In theory I understand that this presents a data race and two threads trying to increment at the same time may result in a single increment, however, nodejs(and javascript) are known to be single threaded. Am I guaranteed that the final value of score will be equal to arbitrary_length?
Am I guaranteed that the final value of score will be equal to
arbitrary_length?
Yes, as long as all async_task() calls call the callback once and only once, you are guaranteed that the final value of score will be equal to arbitrary_length.
It is the single-threaded nature of Javascript that guarantees that there are never two pieces of Javascript running at the exact same time. Instead, because of the event driven nature of Javascript in both browsers and node.js, one piece of JS runs to completion, then the next event is pulled from the event queue and that triggers a callback which will also run to completion.
There is no such thing as interrupt driven Javascript (where some callback might interrupt some other piece of Javascript that is currently running). Everything is serialized through the event queue. This is an enormous simplification and prevents a lot of stickly situations that would otherwise be a lot of work to program safely when you have either multiple threads running concurrently or interrupt driven code.
There still are some concurrency issues to be concerned about, but they have more to do with shared state that multiple asynchronous callbacks can all access. While only one will ever be accessing it at any given time, it is still possible that a piece of code that contains several asynchronous operations could leave some state in an "in between" state while it was in the middle of several async operations at a point where some other async operation could run and could attempt to access that data.
You can read more about the event driven nature of Javascript here: How does JavaScript handle AJAX responses in the background? and that answer also contains a number of other references.
And another similar answer that discusses the kind of shared data race conditions that are possible: Can this code cause a race condition in socket io?
Some other references:
how do I prevent event handlers to handle multiple events at once in javascript?
Do I need to be concerned with race conditions with asynchronous Javascript?
JavaScript - When exactly does the call stack become "empty"?
Node.js server with multiple concurrent requests, how does it work?
To give you an idea of the concurrency issues that can happen in Javascript (even without threads and without interrupts, here's an example from my own code.
I have a Raspberry Pi node.js server that controls the attic fans in my house. Every 10 seconds it checks two temperature probes, one inside the attic and one outside the house and decides how it should control the fans (via relays). It also records temperature data that can be presented in charts. Once an hour, it saves the latest temperature data that was collected in memory to some files for persistence in case of power outage or server crash. That saving operation involves a series of async file writes. Each one of those async writes yields control back to the system and then continues when the async callback is called signaling completion. Because this is a low memory system and the data can potentially occupy a significant portion of the available RAM, the data is not copied in memory before writing (that's simply not practical). So, I'm writing the live in-memory data to disk.
At any time during any of these async file I/O operations, while waiting for a callback to signify completion of the many file writes involved, one of my timers in the server could fire, I'd collect a new set of temperature data and that would attempt to modify the in-memory data set that I'm in the middle of writing. That's a concurrency issue waiting to happen. If it changes the data while I've written part of it and am waiting for that write to finish before writing the rest, then the data that gets written can easily end up corrupted because I will have written out one part of the data, the data will have gotten modified from underneath me and then I will attempt to write out more data without realizing it's been changed. That's a concurrency issue.
I actually have a console.log() statement that explicitly logs when this concurrency issue occurs on my server (and is handled safely by my code). It happens once every few days on my server. I know it's there and it's real.
There are many ways to work around those types of concurrency issues. The simplest would have been to just make a copy in memory of all the data and then write out the copy. Because there are not threads or interrupts, making a copy in memory would be safe from concurrency (there would be no yielding to async operations in the middle of the copy to create a concurrency issue). But, that wasn't practical in this case. So, I implemented a queue. Whenever I start writing, I set a flag on the object that manages the data. Then, anytime the system wants to add or modify data in the stored data while that flag is set, those changes just go into a queue. The actual data is not touched while that flag is set. When the data has been safely written to disk, the flag is reset and the queued items are processed. Any concurrency issue was safely avoided.
So, this is an example of concurrency issues that you do have to be concerned about. One great simplifying assumption with Javascript is that a piece of Javascript will run to completion without any thread of getting interrupted as long as it doesn't purposely return control back to the system. That makes handling concurrency issues like described above lots, lots easier because your code will never be interrupted except when you consciously yield control back to the system. This is why we don't need mutexes and semaphores and other things like that in our own Javascript. We can use simple flags (just a regular Javascript variable) like I described above if needed.
In any entirely synchronous piece of Javascript, you will never be interrupted by other Javascript. A synchronous piece of Javascript will run to completion before the next event in the event queue is processed. This is what is meant by Javascript being an "event-driven" language. As an example of this, if you had this code:
console.log("A");
// schedule timer for 500 ms from now
setTimeout(function() {
console.log("B");
}, 500);
console.log("C");
// spin for 1000ms
var start = Date.now();
while(Data.now() - start < 1000) {}
console.log("D");
You would get the following in the console:
A
C
D
B
The timer event cannot be processed until the current piece of Javascript runs to completion, even though it was likely added to the event queue sooner than that. The way the JS interpreter works is that it runs the current JS until it returns control back to the system and then (and only then), it fetches the next event from the event queue and calls the callback associated with that event.
Here's the sequence of events under the covers.
This JS starts running.
console.log("A") is output.
A timer event is schedule for 500ms from now. The timer subsystem uses native code.
console.log("C") is output.
The code enters the spin loop.
At some point in time part-way through the spin loop the previously set timer is ready to fire. It is up to the interpreter implementation to decide exactly how this works, but the end result is that a timer event is inserted into the Javascript event queue.
The spin loop finishes.
console.log("D") is output.
This piece of Javascript finishes and returns control back to the system.
The Javascript interpreter sees that the current piece of Javascript is done so it checks the event queue to see if there are any pending events waiting to run. It finds the timer event and a callback associated with that event and calls that callback (starting a new block of JS execution). That code starts running and console.log("B") is output.
That setTimeout() callback finishes execution and the interpreter again checks the event queue to see if there are any other events that are ready to run.
Node uses an event loop. You can think of this as a queue. So we can assume, that your for loop puts the function() { score++; } callback arbitrary_length times on this queue. After that the js engine runs these one by one and increase score each time. So yes. The only exception if a callback is not called or the score variable is accessed from somewhere else.
Actually you can use this pattern to do tasks parallel, collect the results and call a single callback when every task is done.
var results = [];
for (var i = 0; i < arbitrary_length; i++) {
async_task(i, function(result) {
results.push(result);
if (results.length == arbitrary_length)
tasksDone(results);
});
}
No two invocations of the function can happen at the same time (b/c node is single threaded) so that will not be a problem. The only problem would be ifin some cases async_task(..) drops the callback. But if, e.g., 'async_task(..)' was just calling setTimeout(..) with the given function, then yes, each call will execute, they will never collide with each other, and 'score' will have the value expected, 'arbitrary_length', at the end.
Of course, the 'arbitrary_length' can't be so great as to exhaust memory, or overflow whatever collection is holding these callbacks. There is no threading issue however.
I do think it’s worth noting for others that view this, you have a common mistake in your code. For the variable i you either need to use let or reassign to another variable before passing it into the async_task(). The current implementation will result in each function getting the last value of i.
// synchronous Javascript
var result = db.get('select * from table1');
console.log('I am syncronous');
// asynchronous Javascript
db.get('select * from table1', function(result){
// do something with the result
});
console.log('I am asynchronous')
I know in synchronous code, console.log() executes after result is fetched from db, whereas in asynchronous code console.log() executes before the db.get() fetches the result.
Now my question is, how does the execution happen behind the scenes for asynchronous code and why is it non-blocking?
I have searched the Ecmascript 5 standard to understand how asynchronous code works but could not find the word asynchronous in the entire standard.
And from nodebeginner.org I also found out that we should not use a return statement as it blocks the event loop. But nodejs api and third party modules contain return statements everywhere. So when should a return statement be used and when shouldn't it?
Can somebody throw some light on this?
First of all, passing a function as a parameter is telling the function that you're calling that you would like it to call this function some time in the future. When exactly in the future it will get called depends upon the nature of what the function is doing.
If the function is doing some networking and the function is configured to be non-blocking or asychronous, then the function will execute, the networking operation will be started and the function you called will return right away and the rest of your inline javascript code after that function will execute. If you return a value from that function, it will return right away, long before the function you passed as a parameter has been called (the networking operation has not yet completed).
Meanwhile, the networking operation is going in the background. It's sending the request, listening for the response, then gathering the response. When the networking request has completed and the response has been collected, THEN and only then does the original function you called call the function you passed as a parameter. This may be only a few milliseconds later or it may be as long as minutes later - depending upon how long the networking operation took to complete.
What's important to understand is that in your example, the db.get() function call has long since completed and the code sequentially after it has also executed. What has not completed is the internal anonymous function that you passed as a parameter to that function. That's being held in a javascript function closure until later when the networking function finishes.
It's my opinion that one thing that confuses a lot of people is that the anonymous function is declared inside of your call to db.get and appears to be part of that and appears that when db.get() is done, this would be done too, but that is not the case. Perhaps that would look less like that if it was represented this way:
function getCompletionfunction(result) {
// do something with the result of db.get
}
// asynchronous Javascript
db.get('select * from table1', getCompletionFunction);
Then, maybe it would be more obvious that the db.get will return immediately and the getCompletionFunction will get called some time in the future. I'm not suggesting you write it this way, but just showing this form as a means of illustrating what is really happening.
Here's a sequence worth understanding:
console.log("a");
db.get('select * from table1', function(result){
console.log("b");
});
console.log("c");
What you would see in the debugger console is this:
a
c
b
"a" happens first. Then, db.get() starts its operation and then immediately returns. Thus, "c" happens next. Then, when the db.get() operation actually completes some time in the future, "b" happens.
For some reading on how async handling works in a browser, see How does JavaScript handle AJAX responses in the background?
jfriend00's answer explains asynchrony as it applies to most users quite well, but in your comment you seemed to want more details on the implementation:
[…] Can any body write some pseudo code, explaining the implementation part of the Ecmascript specification to achieve this kind of functionality? for better understanding the JS internals.
As you probably know, a function can stow away its argument into a global variable. Let's say we have a list of numbers and a function to add a number:
var numbers = [];
function addNumber(number) {
numbers.push(number);
}
If I add a few numbers, as long as I'm referring to the same numbers variable as before, I can access the numbers I added previously.
JavaScript implementations likely do something similar, except rather than stowing numbers away, they stow functions (specifically, callback functions) away.
The Event Loop
At the core of many applications is what's known as an event loop. It essentially looks like this:
loop forever:
get events, blocking if none exist
process events
Let's say you want to execute a database query like in your question:
db.get("select * from table", /* ... */);
In order to perform that database query, it will likely need to perform a network operation. Since network operations can take a significant amount of time, during which the processor is waiting, it makes sense to think that maybe we should, rather than waiting rather than doing some other work, just have it tell us when it's done so we can do other things in the mean time.
For simplicity's sake, I'll pretend that sending will never block/stall synchronously.
The functionality of get might look like this:
generate unique identifier for request
send off request (again, for simplicity, assuming this doesn't block)
stow away (identifier, callback) pair in a global dictionary/hash table variable
That's all get would do; it doesn't do any of the receiving bit, and it itself isn't responsible for calling your callback. That happens in the process events bit. The process events bit might look (partially) like this:
is the event a database response? if so:
parse the database response
look up the identifier in the response in the hash table to retrieve the callback
call the callback with the received response
Real Life
In real life, it's a little more complex, but the overall concept is not too different. If you want to send data, for example, you might have to wait until there's enough space in the operating system's outgoing network buffers before you can add your bit of data. When reading data, you might get it in multiple chunks. The process events bit probably isn't one big function, but itself just calling a bunch of callbacks (which then dispatch to more callbacks, and so on…)
While the implementation details between real life and our example are slightly different, the concept is the same: you kick off ‘doing something’, and a callback will be called through some mechanism or another when the work is done.