Which code will run on the browser's main thread? - javascript

Chrome was the last of the big trio (IE, Firefox, Chrome) to deprecate running synchronous XMLHttpRequest calls on the "main thread" (as Firefox calls it). Some browsers have also completely removed the ability to set the .withCredentials option for synchronous requests on the main thread.
After searching far and wide, I couldn't find enough information to precisely identify which code will run on the main thread, and which will not.
It is obvious that javascript included via script tag (inline or with src) is on the main thread.
And a synchronous XHR which runs inside the callback of an asynchronous XHR would not be running on the main thread.
But how about other scenarios? Mouse events, touch events, various document events? How to tell without trying everything? It would be nice to avoid making everything asynchronous and a callback hell.
Please attempt a thorough answer.
Edit:
W3C spec warning:
Developers must not pass false for the async argument when the JavaScript global environment is a document environment as it has detrimental effects to the end user's experience. User agents are strongly encouraged to warn about such usage in developer tools and may experiment with throwing an "InvalidAccessError" exception when it occurs so the feature can eventually be removed from the platform.
Edit 2:
Clarification:
There are situations where calling code must either wait for all racing simultaneous async calls to finish (using some counters or state tracking variables for each call), or have them chained using callbacks. Each situation sucks. For example, I have a JSONRPC client which needs to dynamically create callable functions by interrogating a reflection API.
It is over the top to have all implementing code (UI or not) run inside the callback of yet another library, especially if it has to be done on multiple pages, and if the library has to behave as a simple definition (hiding that it is running code at define time). This is just an example of complexity; I am not asking for a solution to it, but for a general, clear explanation of how browsers decide which is the main thread.

As you have cited the W3C spec, it's easy to explain what you are hunting after:
Developers must not pass false for the async argument when the
JavaScript global environment is a document environment as it has
detrimental effects to the end user's experience.
What they mean by "document environment" is explained in the processing models:
This specification describes three kinds of JavaScript global
environments: the document environment, the dedicated worker
environment, and the shared worker environment. The dedicated worker
environment and the shared worker environment are both types of worker
environments.
Except where otherwise specified, a JavaScript global environment is a
document environment.
A "document environment" is therefore the global JavaScript environment of a page, i.e. the window that you see. Every JS global environment is single-threaded. Everything (really everything, you considered: Mouse events, touch events, various document events) runs in this environment. This is probably what Gecko considers a "main thread".
It would be nice to avoid making everything asynchronous and a callback hell
Making something asynchronous doesn't shift work from the main thread. It just defers it, making it possible for other events to run while you are waiting. If there is an asynchronous api for what you want to do (i.e. does processing in the background), use it. Make everything asynchronous.
There are enough techniques (e.g. promises!) to avoid a callback hell, which is just a sign of bad code.
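As a minimal sketch of that point, here is how promises can replace the per-call counters mentioned in the question. The `fakeRequest` helper and the resource names are hypothetical stand-ins that simulate an asynchronous request with a timer; they are not part of any real API.

```javascript
// Hypothetical stand-in for an asynchronous request (e.g. an async XHR);
// it resolves after a short delay instead of touching the network.
function fakeRequest(name, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(name + " loaded");
    }, delayMs);
  });
}

// Start all requests at once and wait for every one of them to finish.
// Promise.all keeps the results in input order, so no counters are needed.
function loadAll() {
  return Promise.all([
    fakeRequest("users", 30),
    fakeRequest("settings", 10),
    fakeRequest("reflection", 20),
  ]);
}

loadAll().then(function (results) {
  console.log(results);
});
```

The requests still race each other, but the calling code only deals with one promise that settles when all of them are done.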
Shifting work off the "main thread" requires you to create a new environment - a web worker. In that, you can do as many synchronous XMLHttpRequests as you want without being disturbed.

Each browser is free to implement its own threading model as it sees fit. Different implementations will handle threading differently.
It is safe to say that if you are to block execution with JavaScript that you are doing something you shouldn't be. Even if you don't hang up the UI, browsers these days will prompt the user to abort your script. If you stay within a reasonable amount of blocking processing in your script, this isn't an issue. Synchronous XHR is something you should never do, as it isn't necessary and the time for which the thread will block is unpredictable.

Related

Redis caching in nodejs

So I was looking at this module, and I cannot understand why it uses callbacks.
I thought that memory caching is supposed to be fast and that is also the purpose someone would use caching, because it's fast... like instant.
Adding callbacks implies that you need to wait for something.
But how much you need to wait actually? If the result gets back to you very fast, aren't you slowing things down by wrapping everything in callbacks + promises on top (because as a user of this module you are forced to promisify those callbacks) ?
By design, javascript is asynchronous for most of its external calls (http, 3rd parties libraries, ...).
As mentioned here
Javascript is a single threaded language. This means it has one call stack and one memory heap. As expected, it executes code in order and must finish executing a piece of code before moving onto the next. It's synchronous, but at times that can be harmful. For example, if a function takes a while to execute or has to wait on something, it freezes everything up in the meanwhile.
Having a synchronous function will block the thread and the execution of the script. To avoid any blocking (due to networking, file access, etc.), it is recommended to get this information asynchronously.
Most of the time, the redis caching will take a few ms. However, making the call asynchronous guards against possible network lag and keeps your application up and running, with only a tiny number of connectivity errors for your customers.
TLDR: reading from memory is fast, reading from network is slower and shouldn't block the process
You're right. Reading from memory cache is very fast. It's as fast as accessing any variable (micro or nano seconds), so there is no good reason to implement this as a callback/promise as it will be significantly slower. But this is only true if you're reading from the nodejs process memory.
The problem with using redis from node is that the memory cache is stored on another machine (the redis server) or at least in another process. So even if redis reads the data very quickly, it still has to go through the network to return to your node server, which isn't always guaranteed to be fast (usually a few milliseconds at least). For example, if you're using a redis server which is not physically close to your nodejs server, or you have too many network requests, the request can take longer to reach redis and return to your server. Now imagine if this was blocking by default: it would prevent your server from doing anything else until the request is complete. That would result in very poor performance, as your server would sit idle waiting for the network trip. That's the reason why any I/O (disk, network, ...) operation in nodejs should be async by default.
Alex, you remarked with "I thought that memory caching is supposed to be fast and that is also the purpose someone would use caching, because it's fast... like instant." And you're near being wholly right.
Now, what does Redis actually mean?
It means REmote DIctionary Server.
~ FAQ - Redis
Yes, a dictionary usually performs in O(1) time. However, do note that the perception of the said performance is effective from the facade of procedures running inside the process holding the dictionary. Therefore, access to the memory owned by the Redis process from another process, is a channel of operations that is not O(1).
So, because Redis is a REmote DIctionary Server asynchronous APIs are required to access its service.
As has already been answered here, your redis instance could be on your machine (and accessing redis RAM storage is nearly as fast as accessing a regular javascript variable), but it could also be on another machine/cloud/cluster/you name it. In that case network latency could be problematic, which is why the promises/callbacks syntax is used.
If you are 100% confident that your Redis instance will always live on the same machine as your code, and that waiting on the asynchronous calls to it is fine, you could use the ES2017 async/await syntax to write the calls in a synchronous-looking style and avoid explicit callbacks or promise chains :) Note that await does not actually block the thread; it only suspends the calling function.
But I'm not sure it is worth it, in terms of coding habits and scalability. Every project is different, though, and that could suit you.
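To make the difference concrete, here is a small sketch. The `remoteGet` function is a hypothetical stand-in that simulates a network round trip to a cache with a timer; it is not the API of any real Redis client.

```javascript
// Hypothetical remote cache: the data lives "elsewhere", so each lookup
// costs a simulated network round trip (not a real Redis client API).
const remoteStore = new Map([["greeting", "hello"]]);

function remoteGet(key, latencyMs = 20) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(remoteStore.get(key) ?? null), latencyMs);
  });
}

// Reads top to bottom like blocking code, but `await` only suspends this
// function; the event loop stays free to run other callbacks meanwhile.
async function demo() {
  const events = [];
  setTimeout(() => events.push("other work ran"), 5);
  const value = await remoteGet("greeting");
  events.push("cache returned " + value);
  return events;
}

demo().then((events) => console.log(events));
```

The 5 ms timer fires while the lookup is still in flight, which is exactly what a blocking call would have prevented.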

How does Node.js use fewer threads to handle multiple connections?

I've got no problem with events and callbacks, synchrony/asynchrony, the call stack and the queue.
However, as I understand it, other servers make a new thread for each connection, which contains both the blocking request and the handler for the response to that request, whereas in node this handler would be passed to the main thread as a callback. The ability of this kind of server to handle multiple requests is therefore limited by its ability to create and switch between multiple threads.
When Node receives a blocking request it sends it into asynchrony land while it carries on processing the main thread. What happens in asynchrony land? Doesn't a thread still need to be created to await the response for that request and then to send the event to the node event loop? If so, why isn't Node as limited by the server's ability to create and switch between threads? If not, what happens to the request?
I think there's some confusion over how the event loop actually works. NodeJS doesn't "receive a blocking request" and "send it into asynchrony land". It's asynchronous to begin with - unless you call a ...Sync() pattern function, EVERY call and EVERY operation is async. Confusingly, once you are inside your CODE, EVERY operation is synchronous.
It's a "cooperative multitasking" approach - all calls to the system are expected to "start the ball rolling" and return immediately, while your own code is supposed to do what it needs to do as quickly as possible and yield control back to the JSVM (by returning from your function).
To understand how this works when you're dealing with network communications, you need to go back in time to before threads really even existed. In the early days, if you had multiple network connections, your single-threaded process would have to put together a list of all the sockets it wanted information on (such as "has data arrived for me to read?"), and ask the OS if that was true by calling select(). This would return a yes/no for each socket for each question. This was typically done in a while() loop that ran until the program was terminated. You would ask for a list of sockets with new data, read that data, do something with it, and then go back to sleep, over and over again.
NodeJS is far more sophisticated but this analogy works well for it. It has a main "event loop" that is constantly sleeping until there is work to do, then waking up and doing it.
Everything that you do comes from, or goes into, this channel. If you write data to a network socket, and ask to be notified (called back) when it's done, NodeJS passes your request to the operating system and then goes to sleep. You stop running. Your context is saved - all your local vars are saved. When the OS comes back and says "done!", NodeJS checks its list and sees you wanted to know about this, and calls your function, reloading your context so all your local vars are where you need them.
To be very clear, it is entirely possible that when the data is finished being written to the network, and the OS notification comes back for that, NodeJS is busy with other work! NodeJS won't "create a thread" to handle it - it'll ignore it completely until it gets some free time! It won't be lost... it just won't be handled "yet".
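A small sketch of that behaviour, using a timer as a stand-in for any completion notification: the callback is due almost immediately, but it is only delivered once the currently running code has returned.

```javascript
const order = [];

// Ask to be called back "as soon as possible" (0 ms).
setTimeout(function onTimer() {
  order.push("timer callback");
  console.log(order);
}, 0);

// Keep the single thread busy for about 50 ms. The timer is already due,
// but its callback cannot run until this synchronous code returns.
const start = Date.now();
while (Date.now() - start < 50) {
  // busy work
}
order.push("busy work finished");
```

The array ends up with "busy work finished" first and "timer callback" second: the notification was not lost, just not handled "yet".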
This drives programmers used to threading models nuts - it seems illogical that this constant state of never immediately responding to an incoming event "until it has a chance" could possibly be efficient. But software architectures are often deceiving. Threading models actually have fairly high overhead. CPU core counts aren't infinite - the entire computer as a whole is doing plenty of work all the time. Threads aren't free - just because you make one doesn't mean the CPU itself has time to do anything with it. And the overhead of thread creation and management often means an efficiency loss.
Old-school event-loop models eliminate this overhead. When things go badly like you have an infinite loop in your code, they can behave very badly - often locking up completely. But when things are going well they can actually be a lot faster, and many benchmarks have shown that well-written NodeJS modules can perform as well as or even better than similar modules in other languages.
In summary, the most common confusion in NodeJS is what "async" really means. A good way to think of it is that in threading models, programmers are expected to be "bad"/simplistic (write blocking code and just wait for things to return) and the core VM or OS is expected to be "good"/smart (tolerate this by making threads to handle async work). In NodeJS, programmers are expected to be "good"/sophisticated (write well-structured async code), allowing the JSVM to focus on what it does best and not need as much magic to make things work well. Well-used, NodeJS puts a lot of power in your hands.

How does multi-threading or async code in JavaScript work?

I am no beginner in javascript. I am actually working on this for past 3-4 months but today I read this statement about "What is JavaScript?"
JavaScript is single-threaded, non-blocking, asynchronous, concurrent language.
and I was lost. If JavaScript is single-threaded, how can it be concurrent and how can it be asynchronous because you need to keep track what your async code is doing and without another thread, it is impossible to track 2 or more code at a same time?
Ah.. here's the thing:
JavaScript is single threaded, but it has a lot of spare time on its hands.
When it is waiting for something to load off the network, or it's waiting for something off disk, or waiting for the OS to hand something back to it, it can run other code.
setTimeout(function() {
    // Do something later.
}, 1000);
While it is waiting for that timeout to return and execute that code, it can run code from OTHER timeouts, or network calls or any other async code in the system. It only runs ONE block of code at a time, however, which is why we say it is single threaded.
That thread can just bounce around. A lot.
And, as others have said, there are web workers and service workers, but those run VERY isolated from your main thread. They can't change values behind your main thread's back.
Updated per comment
The event loop works by:
Waiting for an event
Handling that event.
JavaScript is, indeed, blocked while handling an event. While code is running, nothing else in that page (assuming browser main thread) can run.
It isn't a literal event loop as you would have in C or C++, not as far as the JS is concerned. It's just events waiting to happen.
/// Sample code
document.addEventListener("click", function() { /* Handle click */ });
window.addEventListener("load", function() { /* handle load */ });
In this case, we have two event listeners in our code. The JS engine will compile, then execute those two statements. Then, for all intents, it will "sleep" while waiting for something to happen. In reality, that same thread may handle various house-keeping tasks like drawing the HTML page, listening for mouse movements and emitting all sorts of events, but that doesn't matter for this discussion.
Then, once the rest of the page is loaded, the browser will emit a load event, which will be caught by the listener, and some more code will be run.
Then it will go back to idling until someone clicks on the document, then more code will run.
If we change the code to this:
document.addEventListener("click", function() {
    while(true);
});
then when someone clicks on the document, our thread will go into an endless loop and all browser activity in that window will cease. It might even freeze the entire browser, depending on which one you are running.
Eventually, the browser will give you a chance to kill that task so you can have your system back.
Latest Update
If you are aware of Webassembly there is a proposal in place for Threads via natively compiled modules
pthreads-style read this git issue tracker link(1073)
In continuation of @Jeremy J Starcher's answer.
JavaScript has always been a single-threaded runtime using asynchronous, non-blocking and event-driven models of execution.
To know more about event loop execution in JS i highly recommend you to watch this
Youtube video. Simply superb explanation by Philip Roberts.
In the good old days, developers would use workarounds to achieve something similar to a thread model:
setTimeout with 0 milliseconds, or setInterval: Basically instructing the engine to take up non-trivial tasks when it goes idle or into wait mode during an HTTP request, or to execute the code by switching back and forth at intervals in a kind of round-robin fashion.
Hidden iframe: Run JS code in a sandbox with a bridge to communicate from parent to iframe and vice versa. Technically an iframe doesn't run on a separate thread, but it gets things done as a fake thread.
Fast forwarding [ >>> ] to the multi-threading models:
Of late things have changed, with the requirement to spawn a thread in JS engines to offload a few smaller logical tasks or a network proxy task to a separate thread, and to concentrate on UI-driven tasks like the presentation and interaction layer on the main thread, which makes sense.
With that requirement in mind, two models/APIs were standardized to solve this (they come from the web standards bodies, WHATWG and W3C, rather than from ECMA).
1. Web Worker: (SIC - Mozilla)
Web Workers makes it possible to run a script operation in background
thread separate from the main execution thread of a web application.
The advantage of this is that laborious processing can be performed in
a separate thread, allowing the main (usually the UI) thread to run
without being blocked/slowed down.
[ WebWorker can be split into two ]
Shared Worker
The SharedWorker interface represents a specific kind of worker that
can be accessed from several browsing contexts, such as several
windows, iframes or even workers. They implement an interface
different than dedicated workers and have a different global scope,
SharedWorkerGlobalScope.
Dedicated Worker : Same as Webworker, created using Worker() API but uses DedicatedWorkerGlobalScope
Worker is an object created using a constructor (e.g. Worker()) that
runs a named JavaScript file — this file contains the code that will
run in the worker thread; workers run in another global context that
is different from the current window. This context is represented by a
DedicatedWorkerGlobalScope object in the case of dedicated workers
2. Service Worker (SIC - Mozilla)
Service workers essentially act as proxy servers that sit between web
applications, and the browser and network (when available). They are
intended to (amongst other things) enable the creation of effective
offline experiences, intercepting network requests and taking
appropriate action based on whether the network is available and
updated assets reside on the server. They will also allow access to
push notifications and background sync APIs.
One example usage would be in a PWA (Progressive Web App), to download scripts or lazy-load assets.
Read this article by Eric Bidelman on HTML5Rocks for a good explanation of the code itself and its implementation.
JavaScript may be "single-threaded" (I'm not sure this is really the case), but you can use/create webworkers to run javascript outside the main thread.
So you can run two pieces of code at the same time in parallel.
I think it is wrong to say that a language is this or that when what we really mean is that our programs are this or that.
For example: NodeJS is single-threaded and can run code asynchronous because it uses an event-driven behaviour. (Something comes up and fires an event... Node deals with it and if it is something like an online request, it does other things instead of waiting for the response... when the response comes, it fires an event and Node captures it and does whatever needs to be done).
So Javascript is...
single-threaded? No, as you can use WebWorkers as a second thread
non-blocking? You can write code that blocks the main thread. Just build a for that executes a hundred million times or don't use callbacks.
asynchronous? No, unless you use callbacks.
concurrent? Yes, if you use webworkers, callbacks or promises (which are really callbacks).
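As a rough illustration of running code on a second thread: the browser `Worker` API is not available in plain Node.js, so this sketch uses Node's `worker_threads` module as a stand-in (an assumption made only so the example runs from the command line). The `sumInWorker` helper is hypothetical. The idea is the same as with Web Workers: the worker has its own globals and communicates only through messages.

```javascript
const { Worker } = require("worker_threads");

// Source of the worker, passed as a string so the example fits in one file.
// It runs in its own thread with its own globals and talks to the main
// thread only through messages.
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  let sum = 0;
  for (let i = 0; i < workerData.limit; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Returns a promise for 0 + 1 + ... + (limit - 1), computed off the
// main thread.
function sumInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { limit },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

sumInWorker(1000).then((sum) => console.log("sum from worker:", sum));
```

Note that the main thread still receives the result asynchronously, as an event, which is why workers do not remove the need for callbacks or promises.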

Why is there no synchronous WebSocket support in Web Workers when there is synchronous FileSystem support?

I understand why browser vendors don't want to help me block their UI thread. However, I don't understand why there is:
no sleep(2) in Web Workers
no synchronous WebSockets API
There is a synchronous FileSystem API. There is also a synchronous IndexedDB API. To me, it seems like a contradiction.
The reason why there's not a sleep() function available to WebWorkers is simple: you don't need it. sleep is a synchronous function, (it blocks until it returns) which doesn't make sense in the asynchronous context of WebWorkers.
If you send a message to a WebWorker, it doesn't block waiting for a response; the response is sent as a message to your message handler function. If you want to wait a certain amount of time before sending a response, you wouldn't use sleep, you'd use setTimeout and fire a message off when your function gets called.
Similarly, if you're using WebWorkers for WebSocket data transmission, you'd receive a message from the main thread, send a packet via websocket asynchronously, then in the response handler you'd send a message back to the main thread. There's no logical place to use a synchronous sleep function.
As far as why there's not a synchronous mode for WebSockets like there is for the filesystem, the primary difference is that the filesystem isn't accessed across the network.
Generally, asynchronous APIs are preferable for network-based functions, so I guess I don't see it as much of a contradiction.
IDB is only supported by 3 browsers, none of which have implemented the synchronous API, so I don't see that as a shining example of synchronous APIs. In fact, I think that's the contradiction: that people would define an API and not bother to implement it.
It is not obvious at all : TCP protocol is a network protocol too, right ? And it is quite often used in synchronous mode, because it makes applications simpler to develop and debug.
In my opinion async mode is the obvious choice in the context of single-threaded applications, when you don't want I/O to block a UI. It is much less obvious if you intend to use web workers, for instance, to handle background I/O. It would indeed be convenient to have synchronous WebSockets in conjunction with web workers.
Finally, it is just not careful to assume that a file read call will complete, and complete quickly. You should always have a timeout, or accept the fact that your app is going to hang if the I/O doesn't respond.
For me it is quite obvious.
The FileSystem API and IndexedDB API work on the order of milliseconds, so you can trust that you will have your data right away. The WebSockets API, by contrast, must be at least 100 times slower, since the data must fly over the wild internet, so it's obvious to make it asynchronous. Your response may even never come back.
Indexed db will not block execution for longer time, most likely it will give result in few milli seconds and we are not expecting to store millions of records in indexed db. Same with file API, most API will result in quicker execution.
Also synchronous API will lead to race conditions and will require multi thread synchronization etc which will increase programming complexity. Instead message based threading is easier to program and we are free from synchronization issues.
Also, most javascript engines are stable, and people are familiar with async programming. It's the easier and only way to write a worker. Changing this would require a huge rewrite of javascript engines. Introducing more native APIs would make worker programming more complicated. Different operating systems, architectures and devices would introduce more complexity.
Since V8 has implemented ES2017 await/async, I can use that with Promise-enabled libraries, and I don't need the synchronous API so badly anymore.
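For example, the missing sleep() can be approximated with a promise and await. This is a sketch of the common idiom, not a built-in API; the `sleep` and `task` names are made up for illustration.

```javascript
// Resolve after `ms` milliseconds; a promise-based replacement for sleep().
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical worker-style task: do a step, pause, do the next step.
async function task() {
  const log = ["step 1"];
  const started = Date.now();
  await sleep(50); // suspends only this function, not the thread
  log.push("step 2");
  return { log, elapsedMs: Date.now() - started };
}

task().then(({ log, elapsedMs }) => console.log(log, elapsedMs + " ms"));
```

The code reads sequentially like a synchronous sleep would, while message handlers and other callbacks can still run during the pause.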

Does async programming mean multi-threading?

Let's talk about JavaScript code which has a setInterval callback firing every 2 sec.
I also have an onblur animation event for some control.
In a case where onblur occurs (+ animation), the setInterval callback might fire as well.
Question:
Does async programming mean multi-threading? (in any way?)
No. It means literally what it means-- asynchronous. Understanding the difference between asynchronous programming and thread-based programming is critical to your success as a programmer.
In a traditional, non-threaded environment, when a function must wait on an external event (such as a network event, a keyboard or mouse event, or even a clock event), the program must wait until that event happens.
In a multi-threaded environment, many individual threads of programming are running at the same time. (Depending upon the number of CPUs and the support of the operating system, this may be literally true, or it may be an illusion created by sophisticated scheduling algorithms). For this reason, multi-threaded environments are difficult and involve issues of threads locking each other's memory to prevent them from overrunning one another.
In an asynchronous environment, a single process thread runs all the time, but it may, for event-driven reasons (and that is the key), switch from one function to another. When an event happens, and when the currently running process hits a point at which it must wait for another event, the javascript core then scans its list of events and delivers the next one, in a (formally) indeterminate (but probably deterministic) order, to the event manager.
For this reason, event-driven, asynchronous programming avoids many of the pitfalls of traditional, multi-threaded programming, such as memory contention issues. There may still be race conditions, as the order in which events are handled is not up to you, but they're rare and easier to manage. On the other hand, because the event handler does not deliver events until the currently running function hits an idle spot, some functions can starve the rest of the programming. This happens in Node.js, for example, when people foolishly do lots of heavy math in the server-- that's best shoved into a little server that node then "waits" to deliver the answer. Node.js is a great little switchboard for events, but anything that takes longer than 100 milliseconds should be handled in a client/server way.
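Besides moving heavy work to a separate server as suggested above, another commonly used option (named here as an alternative, not the approach this answer recommends) is to split the work into slices and yield to the event loop between them. A minimal sketch, with a hypothetical `sumInChunks` helper:

```javascript
// Sum the integers 0..n-1 in slices, yielding to the event loop between
// slices so that pending events are not starved.
function sumInChunks(n, chunkSize, done) {
  let i = 0;
  let total = 0;

  function runSlice() {
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) {
      total += i;
    }
    if (i < n) {
      setTimeout(runSlice, 0); // let other queued events run first
    } else {
      done(total);
    }
  }

  runSlice();
}

sumInChunks(1000000, 100000, function (total) {
  console.log("total:", total);
});
```

Each slice still blocks while it runs, so the chunk size is a trade-off between throughput and how long other events have to wait.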
In the browser environment, DOM events are treated as automatic event points (they have to be, modifying the DOM delivers a lot of events), but even there badly-written Javascript can starve the core, which is why both Firefox and Chrome have these "This script has stopped responding" interrupt handlers.
A single threaded event loop is a good example of being asynchronous in a single threaded language.
The concept here is that you attach doLater callback handlers to the eventLoop. Then the eventLoop is just a while(true) that checks whether the specific timestamp for each doLater handler is met, and if so it calls the handler.
For those interested, here is a naive (and horribly inefficient toy) implementation of a single threaded event loop in JavaScript
This does mean that without any kind of access to the OS thread scheduler for your single thread, you're forced to busy wait on the doLater callbacks.
If you have a sleep call, you could just sleep until the next doLater handler is due, which is more efficient than a busy wait since you deschedule your single thread and let the OS do other things.
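A toy loop of the kind described here might look like the following. This is an illustrative sketch written for this page, not the implementation originally linked; `createEventLoop`, `doLater` and `run` are made-up names, and the loop deliberately busy-waits.

```javascript
// Toy single-threaded event loop: a list of { at, fn } entries, kept
// sorted by deadline, and a loop that busy-waits until the next one is due.
function createEventLoop() {
  const pending = [];

  function doLater(fn, delayMs) {
    pending.push({ at: Date.now() + delayMs, fn });
    pending.sort((a, b) => a.at - b.at); // earliest deadline first
  }

  function run() {
    while (pending.length > 0) {
      if (pending[0].at <= Date.now()) {
        const entry = pending.shift();
        entry.fn(); // handlers may call doLater() again
      }
      // otherwise spin: this is the busy wait described above
    }
  }

  return { doLater, run };
}

const loop = createEventLoop();
const calls = [];
loop.doLater(() => calls.push("second"), 20);
loop.doLater(() => calls.push("first"), 5);
loop.run(); // blocks (busy-waits) for about 20 ms
console.log(calls);
```

The handlers run in deadline order, "first" then "second", regardless of the order in which they were registered. Replacing the spin with a real sleep until `pending[0].at` would give the more efficient variant mentioned above.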
No, asynchronous programming doesn't mean multithreading specifically.
To run multiple tasks at the same time we can use multi-threading; the other option is an event loop architecture, as in node js.
JavaScript is synchronous and single threaded, and that is why node js is also single threaded, but with the event loop node can do non-blocking I/O operations. Node.js uses the libuv library, which uses a fixed-size thread pool to handle the execution of parallel tasks.
Thread is a sequence of code instructions and also the sub unit of process.
In multi-threading, processes have multiple threads.
In single threading, processes have single thread.
Only in the sense that it executes code haphazardly and risks race conditions. You will not get any performance benefits from using timeouts and intervals.
However, HTML5's WebWorkers do allow for real multithreading in the browser:
http://www.html5rocks.com/en/tutorials/workers/basics/
If there is a callback, something has to call it. The units of execution are threads & so, yes, some other thread has to call the callback, either directly or by queueing up some asynchronous procedure call to the initiating thread.
