I'm going to illustrate my issue with this simple example.
What boggles my mind is why the server never gets created
and the socket never gets printed.
If I were to remove the while loop everything works.
What do I have to change to make the example below function?
const net = require('net');
net.createServer(socket => {
socket.setEncoding('utf-8');
console.log(socket);
}).listen(4242, '127.0.0.1');
console.log('do some while logic here')
while(true) { }
This happens because creating the socket is not an instant process. It has to make system calls and so on; in other words, it is asynchronous. JavaScript works with a main loop and a callback queue. Basically, the main loop is what is currently executing, and the callback queue holds the things that are waiting to be executed (see the MDN docs on this: https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop).
What happens in your case is that your callback goes to the callback queue and waits to be executed, but it never gets to run, because your main loop is blocked by the while (true) {} loop. If you want nonblocking behavior, you need to send the things that are inside your while loop to the callback queue instead. One way to do that in JavaScript is to use setTimeout. E.g.
const net = require('net');
net.createServer(socket => {
socket.setEncoding('utf-8');
console.log(socket);
}).listen(4242, '127.0.0.1');
console.log('do some while logic here')
function main() {
// do something here
setTimeout(main);
}
main()
This way you're not going to have a stack overflow issue, and you get nonblocking behavior in your loop.
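If the body of the loop is real CPU work, the same idea can be applied by slicing the work into chunks. Here is a minimal sketch, assuming a made-up job (summing integers) and using Node's setImmediate instead of setTimeout to hand control back between slices. The names sumInChunks and timerFired are only for illustration.

```javascript
// Hypothetical CPU-heavy job: sum the integers 0..total-1 in slices.
function sumInChunks(total, chunkSize, onDone) {
  let i = 0;
  let sum = 0;

  function runChunk() {
    const end = Math.min(i + chunkSize, total);
    for (; i < end; i++) {
      sum += i;
    }
    if (i < total) {
      // Yield to the event loop so pending callbacks can run,
      // then continue with the next slice.
      setImmediate(runChunk);
    } else {
      onDone(sum);
    }
  }

  runChunk();
}

let timerFired = false;
// This timer stands in for "other events" such as incoming connections.
setTimeout(() => { timerFired = true; }, 0);

const chunkedSum = new Promise(resolve => {
  sumInChunks(5e7, 1e5, resolve);
});

chunkedSum.then(sum => {
  console.log('sum:', sum, 'timer ran while the job was in progress:', timerFired);
});
```

With a single while loop over the whole range, the timer callback could not run until the sum was finished.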
Nodejs is an event driven system that runs your Javascript single threaded. That means that in order for things to work properly, you cannot hog the entire CPU in a while() loop (or any other kind of loop) unless the loop directly contains an await statement that is awaiting an actual promise tied to an asynchronous operation.
This is a basic principle of programming in nodejs and you have to learn how to structure your program logic into the event driven world. You don't show what you're really trying to do, but "polling" anything in a tight loop is generally not the correct way to program an event driven system.
So, in the code you show here:
const net = require('net');
net.createServer(socket => {
socket.setEncoding('utf-8');
console.log(socket);
}).listen(4242, '127.0.0.1');
console.log('do some while logic here')
while(true) { }
Your while loop just spins forever and never allows any events to get processed and therefore your server can never get events about incoming connections. The events will just pile up in the event queue, but you never give nodejs a chance to go back to the event queue to process those events. To do so, you must finish what you're doing and return control back to the system (thus why you can't use the while(true) { } loop).
So, you really need to be thinking event-driven programming in nodejs. You set up event listeners and you execute code some time in the future when those events occur. You can artificially create events with setTimeout() or setInterval(), but doing that constantly or with really, really short time durations is just polling and is not an efficient way to program a nodejs server either.
If you show or describe for us what you're really trying to do in the rest of your code, we can advise the most important part of this question which is how to actually write that code in an event-driven fashion.
I repeat, learning how to program in an event-driven fashion is required for an efficient, scalable nodejs server process.
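As a rough illustration of what "event-driven" means for the server in the question, here is a sketch that reacts to incoming data with a listener instead of polling in a loop. The port 0 (let the OS pick a free port), the 'ack' reply and the built-in test client are assumptions made only so the example runs on its own.

```javascript
const net = require('net');

const received = [];

const server = net.createServer(socket => {
  socket.setEncoding('utf-8');
  // Instead of polling the socket in a loop, register a listener;
  // Node calls it whenever data arrives.
  socket.on('data', chunk => {
    received.push(chunk);
    socket.end('ack');
  });
});

const finished = new Promise((resolve, reject) => {
  server.on('error', reject);
  // Port 0 asks the OS for any free port (an assumption for this demo;
  // the question used 4242).
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    const client = net.connect(port, '127.0.0.1', () => client.write('hello'));
    client.setEncoding('utf-8');
    let reply = '';
    client.on('data', chunk => { reply += chunk; });
    client.on('error', reject);
    // Once the connection is fully closed, stop the server so the process can exit.
    client.on('close', () => {
      server.close(() => resolve(reply));
    });
  });
});

finished.then(reply => console.log('server got', received, 'client got', reply));
```

Nothing here spins: between events the process is idle, and each callback returns quickly so the next event can be handled.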
I am learning Node.js and some JavaScript. I have read up on things like queues and execution stacks.
I am trying to calculate time taken by a websocket request to complete. A very typical emit is of form:
microtime1 = getTimeNow;
socket.emit("message","data", function (error, data) {
// calculate time taken by using microtime module by getting updated time and calculating difference.
microtime2 = getTimeNow;
time = microtime2 - microtime1;
})
If I am sending multiple messages, can I rely on callback getting executed without delay or can there be a hold up in the queue and callback won't get executed.
In other words, would callback only get called once it's in stack or does it get executed while it's waiting to be picked up in the queue ?
Hope, I was able to explain my question.
In other words, would callback only get called once it's in stack or does it get executed while it's waiting to be picked up in the queue ?
The callback gets executed after the event, that it is waiting for, is resolved.
So the callback should work just fine; however, there is a caveat. Because Node.js is single threaded, something else in your program could be blocking the main thread.
For example the simple view of execution may look like this. One event is processed, and then another one is processed after.
However, in reality it may look more like this
Single-threaded refers to the main thread only. Things like I/O operations are done on other threads, which notify the main thread when they are done, and then the callback can be executed.
The problem occurs if your main thread becomes busy while waiting for the network action to complete.
This is hard to predict, though, and depends on what the rest of the app is doing. If your app is not doing anything else, this likely won't be an issue. But, IMHO, a better way is to make hundreds or thousands of calls and take an average, which will account for other possible causes of discrepancies in the delta.
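A rough sketch of the averaging idea follows. Since the real socket is not shown, fakeEmit is a hypothetical stand-in that answers after about 5 ms, and Node's built-in process.hrtime.bigint() is used in place of the microtime module mentioned in the question.

```javascript
// Hypothetical stand-in for socket.emit(..., callback): it "responds"
// after a short delay via setTimeout.
function fakeEmit(message, callback) {
  setTimeout(() => callback(null, message), 5);
}

function measureOnce() {
  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint(); // nanoseconds, monotonic clock
    fakeEmit('data', error => {
      if (error) return reject(error);
      const elapsedNs = process.hrtime.bigint() - start;
      resolve(Number(elapsedNs) / 1e6); // convert to milliseconds
    });
  });
}

async function averageRoundTrip(samples) {
  let total = 0;
  for (let i = 0; i < samples; i++) {
    total += await measureOnce(); // one request at a time
  }
  return total / samples;
}

const averageMs = averageRoundTrip(20);
averageMs.then(ms => console.log(`average over 20 samples: ${ms.toFixed(2)} ms`));
```

Each sample includes whatever time the callback spent waiting in the queue, which is exactly the noise the average is meant to smooth out.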
Additional data from c-sharpcorner.com
The above diagram shows the
execution process of Node.js. Let's understand it step by step.
Step 1
Whenever a request comes to Node.js API, that incoming request is
added to the event queue. This is because Node.js can't handle
multiple requests simultaneously. The first time, the incoming request
is added to the event queue.
Step 2
Now, you can see in the diagram that one loop is there which always
checks if any event or request is available in event queue or not. If
any requests are there, then according to the "First Come, First
Served" property of queue, the requests will be served.
Step 3
This Node.js event loop is single threaded and performs non blocking
i/o tasks, so it sends requests to C++ internal thread pool where lots
of threads can be run. This C++ internal thread pool is the part of
event loop developed in Libuv. This can handle multiple requests. Now,
event loop checks again and again if any event is there in the Event
Queue. If there is any, then it serves to the thread pool if the
blocking process is there.
Step 4
Now, the internal thread pool handles a lot of requests, like database
request, file request, and many more.
Step 5
Whenever any thread completes that task, the callback function calls
and sends the response back to the event loop.
Step 6
Now, event loop sends back the response to the client whose request is
completed.
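To see the hand-off to the internal thread pool from the JavaScript side, here is a small sketch using crypto.pbkdf2, one of the Node core calls that does its work on a libuv worker thread. The password, salt and iteration count are arbitrary values chosen for the demo.

```javascript
const crypto = require('crypto');

const order = [];

const hashed = new Promise((resolve, reject) => {
  // The key derivation runs on a libuv worker thread; the callback is
  // queued for the main thread once the work is finished.
  crypto.pbkdf2('password', 'salt', 10000, 32, 'sha256', (err, key) => {
    if (err) return reject(err);
    order.push('callback');
    resolve(key);
  });
});

// The main thread is not blocked while the hash is being computed.
order.push('main script continues');

hashed.then(key => console.log(order, key.length));
// prints [ 'main script continues', 'callback' ] 32
```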
Reading through the NodeJS Event Loop description I wonder how setTimeout and setInterval can actually work.
The page says NodeJS first runs the given script (let REPL alone for now) and then enters the event loop. But what if I call setTimeout in that script and expect it to trigger while the script is still running? Isn't that the normal case actually? According to the description the timer callback will not be triggered before the main script ends, which sounds really weird to me.
For those interested, here's the NodeJS outer event loop (there are actually 2 nested loops): https://github.com/nodejs/node/blob/master/src/node.cc#L4526
let's do this by example
setTimeout(function(){
print('there');
});
print('hi');
this will print hi, then there
here's what happens:
the script is processed until the last line, and whenever it finds a timer function
it adds it to a queue, which is handled later, at the end of the execution, by the queue scheduler
loop queue => [ setTimeout ]
before exit there should be a scheduler, some kind of a loop to check if we
have something in the queue and handle them, then once queue is out of all timers the loop
will exit.
let's suppose we call setTimeout inside setInterval
setInterval(function(){
setTimeout(function(){
print('hi')
}, 500);
}, 1000);
loop queue => [ setInterval ]
after 1000 ms
setInterval will be fired and the inner setTimeout will be added to the queue
loop queue => [ setTimeout, setInterval ]
now we get back to the main loop, which will wait for another 500 ms
and fire the inner setTimeout function, then remove it from the queue,
because setTimeout should only run once.
loop queue => [ setInterval ]
back to the main loop, we still have items in the queue, so it will wait
another 500 ms and fire again ( 500 + 500 = 1000 ms)
the inner setTimeout function will be added to the queue again
loop queue => [ setTimeout, setInterval ]
back to the main queue again and again ...
Now this is simply how timers work, they are not meant to handle blocking code, it's
a way to run code at some intervals
setInterval(function(){
// do something long running here
while (1) {}
setTimeout(function(){
print('hi')
}, 500);
}, 1000);
main loop will block here and the inner timeout will not be added to the queue, so this
is a bad idea
nodejs and event loop in general are good with network operations because they don't block when
used with select for example.
setInterval(function(){
// check if socket has something
if (socketHasData( socket )){
processSocketData( socket );
}
// do something else that does not block
// maybe schedule another timer here
print('hello');
}, 1000);
libuv which is the event loop used in nodejs, uses threads to handle some
blocking operations like IO operations, open/read/write files
[EDIT] Hmm, re-reading your initial post, I think I know what bugs you. You mentioned nodejs in your post, implying you might be coding a server.
If you are not really familiar with server side JavaScript and more used to php server for example it might be very confusing indeed.
With a php server, a request creates a new thread that will handle it and when the main script (as you call it) ends, then the thread is killed and nothing else runs on the server (except for the webserver that listens to request, like nginx or apache).
With nodejs, it's different. The main thread is alone and always running. So when a request arrives, callbacks are fired, but they still run in that single thread. Said otherwise: the main script never ends (except when you kill it or your server crashes :) )
Well, that is accurate. Because of the single-threaded nature of JavaScript, if a timer ends while the main thread is busy, the timer's callback will wait.
When you do
setTimeout(callback, 1000)
You are not saying "I want this callback to be called in exactly 1s" but actually "I want this callback to be called in, at least, 1s"
This article by John Resig is an excellent read and goes through the details of the JavaScript's timers https://johnresig.com/blog/how-javascript-timers-work/
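A quick way to see the "at least" part for yourself. This sketch assumes arbitrary values: a 50 ms timer and a 200 ms busy loop that keeps the main thread occupied.

```javascript
const requestedMs = 50;
const scheduledAt = Date.now();

const timerDelay = new Promise(resolve => {
  // Resolve with the time that actually passed before the callback ran.
  setTimeout(() => resolve(Date.now() - scheduledAt), requestedMs);
});

// Keep the main thread busy for 200 ms, well past the requested delay.
while (Date.now() - scheduledAt < 200) {}

timerDelay.then(actualMs => {
  console.log(`requested ${requestedMs} ms, callback ran after ${actualMs} ms`);
});
```

The callback cannot run before the busy loop ends, so the measured delay is 200 ms or more even though only 50 ms was requested.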
But what if I call setTimeout in that script and expect it to trigger while the script is still running?
You don't expect that. You expect your synchronous code run to completion way before the timeout occurs.
If the script is still running, because it's doing something blocking - it hangs - then the timeout callback doesn't get a chance to execute, it will wait. That's exactly why we need to write non-blocking code.
Isn't that the normal case actually?
No. Most of the time no JS is executing, the event loop is idling (while there might be background tasks doing the heavy lifting).
Given that Node is single threaded, it (the V8 engine) always executes the current script to completion before moving on to the next one. So when we start a node server with a main script, it loads, parses, compiles and executes that script first, before it runs anything else. When the running script makes an asynchronous I/O call, the call returns immediately and its callback is queued on the event loop; only once the current script has finished do other callbacks, including setTimeout callbacks, get a chance to execute. This is the very nature of the JavaScript engine and the reason Node is not considered good for long-running, in-memory, CPU-intensive tasks.
As @atomrc said in his answer, setTimeout and setInterval are just a hint to node to run the callbacks after the timeout period; there are no guarantees about exactly when.
Lets assume I run this piece of code.
var score = 0;
for (var i = 0; i < arbitrary_length; i++) {
async_task(i, function() { score++; }); // increment callback function
}
In theory I understand that this presents a data race and two threads trying to increment at the same time may result in a single increment, however, nodejs(and javascript) are known to be single threaded. Am I guaranteed that the final value of score will be equal to arbitrary_length?
Am I guaranteed that the final value of score will be equal to
arbitrary_length?
Yes, as long as all async_task() calls call the callback once and only once, you are guaranteed that the final value of score will be equal to arbitrary_length.
It is the single-threaded nature of Javascript that guarantees that there are never two pieces of Javascript running at the exact same time. Instead, because of the event driven nature of Javascript in both browsers and node.js, one piece of JS runs to completion, then the next event is pulled from the event queue and that triggers a callback which will also run to completion.
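To check this against the code in the question, here is a small sketch. The async_task below is a hypothetical stand-in that calls its callback exactly once after a random delay (simulated with setTimeout), and the remaining counter is only there to detect when all callbacks have run.

```javascript
// Hypothetical async_task: calls the callback exactly once after a
// random delay, simulated with setTimeout.
function async_task(i, callback) {
  setTimeout(callback, Math.random() * 20);
}

const arbitrary_length = 1000;
let score = 0;

const allDone = new Promise(resolve => {
  let remaining = arbitrary_length;
  for (let i = 0; i < arbitrary_length; i++) {
    async_task(i, function () {
      score++; // runs to completion; never interleaved with another callback
      if (--remaining === 0) resolve(score);
    });
  }
});

allDone.then(finalScore => console.log(finalScore === arbitrary_length));
// prints true
```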
There is no such thing as interrupt driven Javascript (where some callback might interrupt some other piece of Javascript that is currently running). Everything is serialized through the event queue. This is an enormous simplification and prevents a lot of sticky situations that would otherwise be a lot of work to program safely when you have either multiple threads running concurrently or interrupt driven code.
There still are some concurrency issues to be concerned about, but they have more to do with shared state that multiple asynchronous callbacks can all access. While only one will ever be accessing it at any given time, it is still possible that a piece of code that contains several asynchronous operations could leave some state in an "in between" state while it was in the middle of several async operations at a point where some other async operation could run and could attempt to access that data.
You can read more about the event driven nature of Javascript here: How does JavaScript handle AJAX responses in the background? and that answer also contains a number of other references.
And another similar answer that discusses the kind of shared data race conditions that are possible: Can this code cause a race condition in socket io?
Some other references:
how do I prevent event handlers to handle multiple events at once in javascript?
Do I need to be concerned with race conditions with asynchronous Javascript?
JavaScript - When exactly does the call stack become "empty"?
Node.js server with multiple concurrent requests, how does it work?
To give you an idea of the concurrency issues that can happen in Javascript (even without threads and without interrupts), here's an example from my own code.
I have a Raspberry Pi node.js server that controls the attic fans in my house. Every 10 seconds it checks two temperature probes, one inside the attic and one outside the house and decides how it should control the fans (via relays). It also records temperature data that can be presented in charts. Once an hour, it saves the latest temperature data that was collected in memory to some files for persistence in case of power outage or server crash. That saving operation involves a series of async file writes. Each one of those async writes yields control back to the system and then continues when the async callback is called signaling completion. Because this is a low memory system and the data can potentially occupy a significant portion of the available RAM, the data is not copied in memory before writing (that's simply not practical). So, I'm writing the live in-memory data to disk.
At any time during any of these async file I/O operations, while waiting for a callback to signify completion of the many file writes involved, one of my timers in the server could fire, I'd collect a new set of temperature data and that would attempt to modify the in-memory data set that I'm in the middle of writing. That's a concurrency issue waiting to happen. If it changes the data while I've written part of it and am waiting for that write to finish before writing the rest, then the data that gets written can easily end up corrupted because I will have written out one part of the data, the data will have gotten modified from underneath me and then I will attempt to write out more data without realizing it's been changed. That's a concurrency issue.
I actually have a console.log() statement that explicitly logs when this concurrency issue occurs on my server (and is handled safely by my code). It happens once every few days on my server. I know it's there and it's real.
There are many ways to work around those types of concurrency issues. The simplest would have been to just make a copy in memory of all the data and then write out the copy. Because there are not threads or interrupts, making a copy in memory would be safe from concurrency (there would be no yielding to async operations in the middle of the copy to create a concurrency issue). But, that wasn't practical in this case. So, I implemented a queue. Whenever I start writing, I set a flag on the object that manages the data. Then, anytime the system wants to add or modify data in the stored data while that flag is set, those changes just go into a queue. The actual data is not touched while that flag is set. When the data has been safely written to disk, the flag is reset and the queued items are processed. Any concurrency issue was safely avoided.
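The flag-plus-queue approach can be sketched roughly like this. It is not the actual server code: TemperatureLog, fakeWrite and the sample readings are made up for illustration, and the async file write is simulated with setTimeout.

```javascript
class TemperatureLog {
  constructor() {
    this.readings = [];
    this.writing = false; // set while a save is in progress
    this.pending = [];    // changes deferred during a save
  }

  add(reading) {
    if (this.writing) {
      this.pending.push(reading); // do not touch live data mid-save
    } else {
      this.readings.push(reading);
    }
  }

  async save(writeChunk) {
    this.writing = true;
    try {
      // Several async writes; control returns to the event loop at each await.
      for (const reading of this.readings) {
        await writeChunk(reading);
      }
    } finally {
      this.writing = false;
      // Apply everything that arrived while the save was running.
      this.readings.push(...this.pending);
      this.pending = [];
    }
  }
}

// Hypothetical stand-in for an async file write.
const written = [];
function fakeWrite(reading) {
  return new Promise(resolve => {
    setTimeout(() => { written.push(reading); resolve(); }, 5);
  });
}

const log = new TemperatureLog();
log.add(20);
log.add(21);

const saved = log.save(fakeWrite);
// A "timer" fires in the middle of the save and records a new reading.
setTimeout(() => log.add(22), 2);

saved.then(() => console.log('written:', written, 'in memory:', log.readings));
// prints written: [ 20, 21 ] in memory: [ 20, 21, 22 ]
```

The reading that arrives mid-save is neither lost nor written half-way; it is applied to the in-memory data once the save has finished.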
So, this is an example of concurrency issues that you do have to be concerned about. One great simplifying assumption with Javascript is that a piece of Javascript will run to completion without any threat of getting interrupted as long as it doesn't purposely return control back to the system. That makes handling concurrency issues like described above lots, lots easier because your code will never be interrupted except when you consciously yield control back to the system. This is why we don't need mutexes and semaphores and other things like that in our own Javascript. We can use simple flags (just a regular Javascript variable) like I described above if needed.
In any entirely synchronous piece of Javascript, you will never be interrupted by other Javascript. A synchronous piece of Javascript will run to completion before the next event in the event queue is processed. This is what is meant by Javascript being an "event-driven" language. As an example of this, if you had this code:
console.log("A");
// schedule timer for 500 ms from now
setTimeout(function() {
console.log("B");
}, 500);
console.log("C");
// spin for 1000ms
var start = Date.now();
while(Date.now() - start < 1000) {}
console.log("D");
You would get the following in the console:
A
C
D
B
The timer event cannot be processed until the current piece of Javascript runs to completion, even though it was likely added to the event queue sooner than that. The way the JS interpreter works is that it runs the current JS until it returns control back to the system and then (and only then), it fetches the next event from the event queue and calls the callback associated with that event.
Here's the sequence of events under the covers.
This JS starts running.
console.log("A") is output.
A timer event is scheduled for 500ms from now. The timer subsystem uses native code.
console.log("C") is output.
The code enters the spin loop.
At some point in time part-way through the spin loop the previously set timer is ready to fire. It is up to the interpreter implementation to decide exactly how this works, but the end result is that a timer event is inserted into the Javascript event queue.
The spin loop finishes.
console.log("D") is output.
This piece of Javascript finishes and returns control back to the system.
The Javascript interpreter sees that the current piece of Javascript is done so it checks the event queue to see if there are any pending events waiting to run. It finds the timer event and a callback associated with that event and calls that callback (starting a new block of JS execution). That code starts running and console.log("B") is output.
That setTimeout() callback finishes execution and the interpreter again checks the event queue to see if there are any other events that are ready to run.
Node uses an event loop. You can think of it as a queue. So we can assume that your for loop puts the function() { score++; } callback on this queue arbitrary_length times. After that, the JS engine runs them one by one and increases score each time. So yes. The only exceptions are if a callback is never called or if the score variable is modified from somewhere else.
Actually you can use this pattern to run tasks in parallel, collect the results and call a single callback when every task is done.
var results = [];
for (var i = 0; i < arbitrary_length; i++) {
async_task(i, function(result) {
results.push(result);
if (results.length == arbitrary_length)
tasksDone(results);
});
}
No two invocations of the function can happen at the same time (b/c node is single threaded) so that will not be a problem. The only problem would be if in some cases async_task(..) drops the callback. But if, e.g., 'async_task(..)' was just calling setTimeout(..) with the given function, then yes, each call will execute, they will never collide with each other, and 'score' will have the value expected, 'arbitrary_length', at the end.
Of course, the 'arbitrary_length' can't be so great as to exhaust memory, or overflow whatever collection is holding these callbacks. There is no threading issue however.
I do think it's worth noting for others who view this that there is a common mistake closely related to this code. If the callback itself read the loop variable i (declared with var), every callback would see the last value of i; you would need let, or a copy of i in another variable. As written, i is passed into async_task() as an argument and evaluated at call time, so each call receives its own value.
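A minimal sketch of the loop-variable point, assuming hypothetical helpers (async_task and later) built on setTimeout. It compares a value passed as an argument, a var read inside the callback, and a let read inside the callback.

```javascript
const passedAsArgument = [];
const readWithVar = [];
const readWithLet = [];

// Hypothetical helper: runs the callback on a later tick.
function later(callback) {
  setTimeout(callback, 0);
}

// Hypothetical async_task: receives i as an argument and hands it back later.
function async_task(i, callback) {
  later(() => callback(i));
}

for (var i = 0; i < 3; i++) {
  async_task(i, value => passedAsArgument.push(value)); // i is read at call time
  later(() => readWithVar.push(i)); // reads the shared var after the loop has ended
}

for (let j = 0; j < 3; j++) {
  later(() => readWithLet.push(j)); // each iteration has its own j
}

// Scheduled last, so it runs after all the callbacks above.
const closureDemo = new Promise(resolve => later(resolve)).then(() => {
  console.log(passedAsArgument, readWithVar, readWithLet);
  return { passedAsArgument, readWithVar, readWithLet };
});
// prints [ 0, 1, 2 ] [ 3, 3, 3 ] [ 0, 1, 2 ]
```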
I am thinking about it and this is what I came up with:
Let's see this code below:
console.clear();
console.log("a");
setTimeout(function(){console.log("b");},1000);
console.log("c");
setTimeout(function(){console.log("d");},0);
A request comes in, and JS engine starts executing the code above step by step. The first two calls are sync calls. But when it comes to setTimeout method, it becomes an async execution. But JS immediately returns from it and continue executing, which is called Non-Blocking or Async. And it continues working on other etc.
The results of this execution is the following:
a c d b
So basically the second setTimeout got finished first and its callback function gets executed earlier than the first one and that makes sense.
We are talking about single-threaded application here. JS Engine keeps executing this and unless it finishes the first request, it won't go to second one. But the good thing is that it won't wait for blocking operations like setTimeout to resolve so it will be faster because it accepts the new incoming requests.
But my questions arise around the following items:
#1: If we are talking about a single-threaded application, then what mechanism processes setTimeouts while the JS engine accepts more requests and executes them? How does the single thread continue working on other requests? What works on setTimeout while other requests keep coming in and get executed.
#2: If these setTimeout functions get executed behind the scenes while more requests are coming in and being executed, what carries out the async executions behind the scenes? What is this thing that we talk about called the EventLoop?
#3: But shouldn't the whole method be put in the EventLoop so that the whole thing gets executed and the callback method gets called? This is what I understand when talking about callback functions:
function downloadFile(filePath, callback)
{
blah.downloadFile(filePath);
callback();
}
But in this case, how does the JS Engine know if it is an async function so that it can put the callback in the EventLoop? Perhaps something like the async keyword in C# or some sort of an attribute which indicates the method JS Engine will take on is an async method and should be treated accordingly.
#4: But an article says quite contrary to what I was guessing on how things might be working:
The Event Loop is a queue of callback functions. When an async
function executes, the callback function is pushed into the queue. The
JavaScript engine doesn't start processing the event loop until the
code after an async function has executed.
#5: And there is this image here which might be helpful but the first explanation in the image is saying exactly the same thing mentioned in question number 4:
So my question here is to get some clarifications about the items listed above?
1: If we are talking about a single-threaded application, then what processes setTimeouts while JS engine accepts more requests and executes them? Isn't that single thread will continue working on other requests? Then who is going to keep working on setTimeout while other requests keep coming and get executed.
There's only 1 thread in the node process that will actually execute your program's JavaScript. However, within node itself, there are actually several threads handling operation of the event loop mechanism, and this includes a pool of IO threads and a handful of others. The key is the number of these threads does not correspond to the number of concurrent connections being handled like they would in a thread-per-connection concurrency model.
Now about "executing setTimeouts", when you invoke setTimeout, all node does is basically update a data structure of functions to be executed at a time in the future. It basically has a bunch of queues of stuff that needs doing and every "tick" of the event loop it selects one, removes it from the queue, and runs it.
A key thing to understand is that node relies on the OS for most of the heavy lifting. So incoming network requests are actually tracked by the OS itself and when node is ready to handle one it just uses a system call to ask the OS for a network request with data ready to be processed. So much of the IO "work" node does is either "Hey OS, got a network connection with data ready to read?" or "Hey OS, any of my outstanding filesystem calls have data ready?". Based upon its internal algorithm and event loop engine design, node will select one "tick" of JavaScript to execute, run it, then repeat the process all over again. That's what is meant by the event loop. Node is basically at all times determining "what's the next little bit of JavaScript I should run?", then running it. This factors in which IO the OS has completed, and things that have been queued up in JavaScript via calls to setTimeout or process.nextTick.
2: If these setTimeout will get executed behind the scenes while more requests are coming and in and being executed, the thing carry out the async executions behind the scenes is that the one we are talking about EventLoop?
No JavaScript gets executed behind the scenes. All the JavaScript in your program runs front and center, one at a time. What happens behind the scenes is the OS handles IO and node waits for that to be ready and node manages its queue of javascript waiting to execute.
3: How can JS Engine know if it is an async function so that it can put it in the EventLoop?
There is a fixed set of functions in node core that are async because they make system calls and node knows which these are because they have to call the OS or C++. Basically all network and filesystem IO as well as child process interactions will be asynchronous and the ONLY way JavaScript can get node to run something asynchronously is by invoking one of the async functions provided by the node core library. Even if you are using an npm package that defines its own API, in order to yield the event loop, eventually that npm package's code will call one of node core's async functions and that's when node knows the tick is complete and it can start the event loop algorithm again.
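In other words, taking a callback does not by itself make a function asynchronous. A minimal sketch of the difference, with two made-up functions (syncWithCallback and asyncWithCallback) and setImmediate as the node core function that does the deferring:

```javascript
const order = [];

// A plain function that takes a callback is still synchronous:
// the callback runs before the function returns.
function syncWithCallback(callback) {
  callback();
}

// This version is asynchronous only because it hands the callback
// to a Node core scheduling function (setImmediate here).
function asyncWithCallback(callback) {
  setImmediate(callback);
}

syncWithCallback(() => order.push('sync callback'));
asyncWithCallback(() => order.push('async callback'));
order.push('end of script');

// Scheduled after the callback above, so it sees the final order.
const orderSettled = new Promise(resolve => setImmediate(() => resolve(order)));
orderSettled.then(result => console.log(result));
// prints [ 'sync callback', 'end of script', 'async callback' ]
```

The downloadFile example in the question behaves like syncWithCallback unless blah.downloadFile itself ends up calling one of the core async functions.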
4 The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.
Yes, this is true, but it's misleading. The key thing is the normal pattern is:
//Let's say this code is running in tick 1
fs.readFile("/home/barney/colors.txt", function (error, data) {
//The code inside this callback function will absolutely NOT run in tick 1
//It will run in some tick >= 2
});
//This code will absolutely also run in tick 1
//HOWEVER, typically there's not much else to do here,
//so at some point soon after queueing up some async IO, this tick
//will have nothing useful to do so it will just end because the IO result
//is necessary before anything useful can be done
So yes, you could totally block the event loop by just counting Fibonacci numbers synchronously all in memory all in the same tick, and yes that would totally freeze up your program. It's cooperative concurrency. Every tick of JavaScript must yield the event loop within some reasonable amount of time or the overall architecture fails.
Don't think of the host process as single-threaded; it is not. What is single-threaded is the portion of the host process that executes your JavaScript code.
Except for background workers, but these complicate the scenario...
So, all your JS code runs in the same thread, and there's no possibility of two different portions of your JS code running concurrently (so you get no concurrency nightmare to manage).
The js code that is executing is the last code that the host process picked up from the event loop.
In your code you can basically do two things: run synchronous instructions, and schedule functions to be executed in the future, when some event happens.
Here is my mental representation (beware: it's just that, I don't know the browser implementation details!) of your example code:
console.clear(); //exec sync
console.log("a"); //exec sync
setTimeout( //schedule inAWhile to be executed at now +1 s
function inAWhile(){
console.log("b");
},1000);
console.log("c"); //exec sync
setTimeout(
function justNow(){ //schedule justNow to be executed just now
console.log("d");
},0);
While your code is running, another thread in the host process keeps track of all the system events that are occurring (clicks on the UI, files read, network packets received, etc.).
When your code completes, it is removed from the event loop, and the host process goes back to checking the loop to see whether there is more code to run. The event loop now contains two more event handlers: one to be executed right away (the justNow function) and another to be executed in a second (the inAWhile function).
The host process now tries to match the events that have happened against the handlers registered for them.
It finds that the event justNow is waiting for has happened, so it starts running that function's code. When justNow exits, it checks the event loop again, searching for handlers whose events have occurred. Supposing that 1 s has passed, it runs the inAWhile function, and so on.
The Event Loop has one simple job - to monitor the Call Stack, the Callback Queue and Micro task queue. If the Call Stack is empty, the Event Loop will take the first event from the micro task queue then from the callback queue and will push it to the Call Stack, which effectively runs it. Such an iteration is called a tick in the Event Loop.
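The priority of the microtask queue over the callback queue can be observed with a few lines. This is only a sketch: the labels are arbitrary, and the 10 ms timer at the end is just there to collect the result after everything else has run.

```javascript
const order = [];

setTimeout(() => order.push('timeout callback'), 0);           // callback (macrotask) queue
Promise.resolve().then(() => order.push('promise microtask')); // microtask queue
queueMicrotask(() => order.push('queueMicrotask'));            // microtask queue
order.push('synchronous code');

const tickOrder = new Promise(resolve => setTimeout(() => resolve(order), 10));
tickOrder.then(result => console.log(result));
// prints [ 'synchronous code', 'promise microtask', 'queueMicrotask', 'timeout callback' ]
```

The timeout was registered first, yet both microtasks run before it, because the microtask queue is drained as soon as the call stack is empty.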
As most developers know, JavaScript is single threaded, which means that two statements cannot be executed in parallel. Execution happens line by line, so each JavaScript statement is synchronous and blocking. But there is a way to run code asynchronously: the setTimeout() function, a Web API provided by the browser, runs your callback after at least the specified time (in milliseconds).
Example:
console.log("Start");
setTimeout(function cbT(){
console.log("Set time out");
},5000);
fetch("http://developerstips.com/").then(function cbF(){
console.log("Call back from developerstips");
});
// Millions of lines of code
// for example, taking 10000 milliseconds to execute
console.log("End");
setTimeout takes a callback function as its first parameter and a delay in milliseconds as its second parameter.
After executing the code above in the browser console, it will print:
Start
End
Call back from developerstips
Set time out
Note: Your asynchronous code runs after all the synchronous code is done executing.
How the code executes, line by line:
The JS engine executes the 1st line and prints "Start" to the console.
On the 2nd line it sees the setTimeout call with the callback named cbT. The browser starts a 5000 ms timer, and once it expires the cbT function is pushed to the callback queue.
Execution then moves on to the fetch call. fetch returns a promise and the cbF function is registered on it; once the promise settles, cbF is pushed to the microtask queue.
Then it executes the millions of lines of code and, at the end, prints "End".
After the main thread finishes executing, the event loop first checks the microtask queue and then the callback queue. In our case it takes the cbF function from the microtask queue and pushes it onto the call stack, then picks the cbT function from the callback queue and pushes it onto the call stack.
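Here is a version of the walkthrough you can run without a network. Note the swap: I replaced fetch with an already-resolved promise, and the "millions of lines" with a 50 ms busy-wait. With a real fetch, cbF is only queued once the response arrives, so its position relative to the timer depends on network timing. blockFor and runStartEndDemo are names I made up for the sketch.

```javascript
// Keeps the thread busy for roughly `ms` milliseconds (stand-in for slow code).
function blockFor(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) { /* busy-wait */ }
}

function runStartEndDemo() {
  return new Promise((resolve) => {
    const output = [];

    output.push("Start");

    setTimeout(function cbT() {            // callback queue, after 10 ms
      output.push("Set time out");
      resolve(output);
    }, 10);

    Promise.resolve().then(function cbF() { // microtask queue
      output.push("Promise callback");
    });

    blockFor(50);                           // timer expires while we are busy
    output.push("End");
  });
}

runStartEndDemo().then((output) => console.log(output.join("\n")));
// prints: Start, End, Promise callback, Set time out (one per line)
```

Even though the 10 ms timer expired in the middle of the busy-wait, cbT still had to wait for the synchronous code and then the microtask to finish.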
JavaScript is a high-level, single-threaded, interpreted language. This means it needs an engine that converts the JS code to machine code: V8 in Chrome, JavaScriptCore (part of WebKit) in Safari. The engine itself provides the memory heap and the call stack; the event loop, timers, Web APIs, events, etc. are supplied by the environment that hosts it, such as the browser.
Event loop: microtasks and macrotasks
The event loop concept is very simple. There’s an endless loop, where the JavaScript engine waits for tasks, executes them and then sleeps, waiting for more tasks.
Tasks are set – the engine handles them – then waits for more tasks (while sleeping and consuming close to zero CPU). It may happen that a task comes while the engine is busy, then it’s enqueued. The tasks form a queue, the so-called “macrotask queue”.
Microtasks come solely from our code. They are usually created by promises: an execution of .then/catch/finally handler becomes a microtask. Microtasks are used “under the cover” of await as well, as it’s another form of promise handling. Immediately after every macrotask, the engine executes all tasks from microtask queue, prior to running any other macrotasks or rendering or anything else.
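To see the "under the cover of await" part in action, here is a short sketch. The step and runAwaitDemo functions are hypothetical names for the demo; `await null` is used only because it needs no real asynchronous operation.

```javascript
async function step(order) {
  order.push("before await");  // runs synchronously when step() is called
  await null;                  // the rest of the function resumes as a microtask
  order.push("after await");
}

function runAwaitDemo() {
  return new Promise((resolve) => {
    const order = [];

    setTimeout(() => {         // macrotask, scheduled first
      order.push("macrotask");
      resolve(order);
    }, 0);

    step(order);               // starts running immediately, pauses at await
    order.push("sync code done");
  });
}

runAwaitDemo().then((order) => console.log(order));
// prints [ 'before await', 'sync code done', 'after await', 'macrotask' ]
```

The code after the await behaves exactly like a .then handler: it waits for the current synchronous code, but still runs before the timer macrotask.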
I've been trying to understand some code used to open a websocket:
var ws = new WebSocket('ws://my.domain.com');
ws.onopen = function(event) {
...
}
My question is how does the handshaking get started? If it is started in the WebSocket constructor, then how does onopen get called if it isn't set by then? If the WebSocket constructor creates a thread that does the handshaking, then does onopen have to be defined quickly enough before the handshaking is over? If so, that sounds a little dangerous because if the JS virtual machine is slowed the handshaking could be finished before onopen is defined, which means that the event is not handled. Or does setting the onopen function trigger the handshaking?
Could someone explain to me the mechanics of the API please?
The browser does not look for the onopen handler until the currently running (synchronous) code has finished. That is because establishing the connection, and thus calling the onopen callback, is asynchronous.
Consider:
let x = false;
setTimeout(function () {
x = true
}, 1000);
while(!x){
console.log('waiting!');
}
The while loop there will never end, although you might expect it to end after one second. The timer callback cannot run until the loop gives up the thread, and the loop never does.
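For comparison, one possible non-blocking way to wait for the same flag is to re-check it from a timer instead of a loop. This is just a sketch; waitUntil is a helper I made up, not a built-in.

```javascript
// Hypothetical helper: re-checks `predicate` every `intervalMs` milliseconds
// and resolves once it returns true. Between checks the thread is free.
function waitUntil(predicate, intervalMs = 10) {
  return new Promise((resolve) => {
    function check() {
      if (predicate()) {
        resolve();
      } else {
        setTimeout(check, intervalMs); // yield to the event loop, then retry
      }
    }
    check();
  });
}

let x = false;
setTimeout(function () {
  x = true;
}, 100);

waitUntil(() => !!x).then(() => console.log("done waiting"));
```

Because each check returns to the event loop, the timer that sets x gets its chance to run and the wait ends.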
If you delay the assignment of onopen by running time-consuming (but synchronous) code, that is not dangerous. On the other hand, if you defer the assignment of onopen with setTimeout, there is no guarantee that it is defined by the time the WebSocket connection is ready, as you can't be sure which callback will be executed first.
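Both cases can be simulated without a server. FakeSocket below is a hypothetical stand-in, not the real WebSocket: it assumes only that "open" is delivered as a queued callback, and it uses a 0 ms timer to mimic a connection that is ready almost immediately.

```javascript
// Hypothetical stand-in for WebSocket: "connects" asynchronously, then calls
// whatever onopen handler happens to be set at that moment.
class FakeSocket {
  constructor() {
    this.onopen = null;
    setTimeout(() => {
      if (typeof this.onopen === "function") this.onopen();
    }, 0);
  }
}

function runHandlerDemo() {
  return new Promise((resolve) => {
    const result = { sameTurn: false, deferred: false };

    // Case 1: slow synchronous code, then the handler is assigned.
    const a = new FakeSocket();
    const start = Date.now();
    while (Date.now() - start < 20) { /* busy for 20 ms */ }
    a.onopen = () => { result.sameTurn = true; };

    // Case 2: the assignment itself is deferred with setTimeout.
    const b = new FakeSocket();
    setTimeout(() => {
      b.onopen = () => { result.deferred = true; };
    }, 20);

    setTimeout(() => resolve(result), 60);
  });
}

runHandlerDemo().then((result) => console.log(result));
// prints { sameTurn: true, deferred: false }
```

In case 1 the "open" callback cannot run until the synchronous code (including the assignment) has finished, so the handler is always there. In case 2 two callbacks race, and here the handler is assigned too late.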
If you were doing the same thing in C++ you'd use threads for that. In JavaScript the callback mechanism is not thread-based; it just behaves thread-like (see the endless while loop above).
Single thread executes one code-unit at a time and other code units
are queued until the current code unit is finished executing
source: http://www.slideshare.net/clutchski/writing-asynchronous-javascript-101
It's important to understand that even if you setTimeout something for 1 s, it might not execute after one second. If the thread is busy, it might never get executed.
Thus if you initiate a WebSocket connection and then run a loop similar to the one above, waiting for the connection to be ready, the loop might never end.
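You can measure the effect yourself. A rough sketch (measureTimerDelay is a name I made up; the 10 ms and 100 ms values are arbitrary):

```javascript
// Schedules a timer for `requestedMs`, then keeps the thread busy for
// `blockMs`. Resolves with how long the callback actually had to wait.
function measureTimerDelay(requestedMs, blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();

    setTimeout(() => resolve(Date.now() - scheduledAt), requestedMs);

    while (Date.now() - scheduledAt < blockMs) { /* busy-wait */ }
  });
}

measureTimerDelay(10, 100).then((elapsed) => {
  console.log(`asked for 10 ms, callback ran after ${elapsed} ms`);
});
```

The reported time is at least the 100 ms the thread was busy, not the 10 ms that was requested. The delay passed to setTimeout is a minimum, not a guarantee.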
This behaviour might look strange to programmers not familiar with JS. Therefore, for readability, I define callbacks at the same time as, or immediately after, the functions that need them whenever possible.
If you want to explicitly use threads and concurrent execution, read more about Web Workers.
Reference:
How JavaScript Timers Work
Understanding JavaScript timers
You don't need any setTimeout function. I'm using a library for this and my code looks something like this:
var pushstream = new PushStream({
host: window.location.hostname,
port: window.location.port,
modes: "websocket"
});
pushstream.onmessage = _manageEvent;
function _manageEvent(eventMessage) {
console.log(eventMessage);
}
This gave me a hell of an insight on websockets and how to implement a client in Javascript: https://github.com/wandenberg/nginx-push-stream-module/blob/master/misc/js/pushstream.js
And also the server: https://github.com/wandenberg/nginx-push-stream-module/
It's very well documented. I hope it helps :)