I have a general question that I'm having trouble grappling with about web workers. I understand that they engage in background calculations in another thread so they take off the load from the window that the user is in.
However I'm confused on whether that 'other thread' means something like having a different program running on the computer, having a separate browser open, or whether it's like a new tab in the same browser. I feel that this is more of the latter case, but I'm not 100% sure about that and I can't find good explanations.
What implications does this have on the limitations of what we can do with web workers?
Thanks in advance!
A web worker works like an independent thread of execution. Multiple threads can run within a single process. If there are multiple processors, these threads can actually run at the same time. If there is only a single processor, then the OS on the computer handles time slicing between the different threads such that each one runs for a short while, then the next one runs and, to the casual observer, they appear to be running at the same time.
In a browser, a web worker is indeed a thread of execution that runs independently of the browser window thread (of which there is one for each browser page that is open in the browser). The browser window thread has a number of limitations. The main limitation is that it only processes user events (mouse movement, mouse clicks, keyboard events, etc...) when no JavaScript code is running in the main browser thread. So, if you were to run some long running JavaScript code in the main browser thread, the browser will "appear" to be locked up and won't process any user events while that JavaScript is running. This is generally considered a bad user experience.
But, if you run this JavaScript in a web worker, it can go do its long running thing without blocking the processing of events in the main browser window thread. When it finishes its long running computation, it can then send a message to the main browser window thread and the result can be processed (e.g. displayed in the page or whatever the particular action is).
There are ways to work around the limitations of the main browser thread by breaking your work into small chunks and executing those chunks on a recurring timer. But, using a web worker thread can significantly simplify the programming.
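The chunking work-around can be sketched roughly like this. It is only a minimal illustration: `processInChunks`, its parameters, and the injectable `schedule` function are hypothetical names made up for this example.

```javascript
// Process `items` a few at a time, yielding to the event loop between
// chunks so the browser can handle user events in the gaps.
// `schedule` defaults to a zero-delay timer; it is a parameter only so the
// sketch can be tried without real timers (an assumption of this example).
function processInChunks(items, chunkSize, processItem, onDone, schedule) {
  schedule = schedule || function (fn) { setTimeout(fn, 0); };
  var index = 0;

  function runChunk() {
    var end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      processItem(items[index], index);
    }
    if (index < items.length) {
      schedule(runChunk);   // more work left: come back on a later tick
    } else {
      onDone();             // everything has been processed
    }
  }

  runChunk();
}
```

Something like `processInChunks(data, 500, handleItem, showResult)` would then handle 500 items per timer tick. Note how much bookkeeping this needs compared with handing the whole loop to a worker.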
Web workers themselves cannot access the browser page in any way. They can't read values out of it or modify it - they can't run animations, etc... This limits their usefulness a bit to tasks that are more independent from the page. The classic use is some long running calculation (e.g. analyzing data from an image, carrying out ajax calls, doing some complex calculation, etc...). Web workers can communicate with the main thread via a messaging system. It's kind of like leaving a voicemail. The webworker calls up the main thread and leaves a message for it. The next time the main thread has nothing to do, it checks to see if there are any messages from web workers and if so, it processes them. In this way, the main thread and the web worker thread can communicate, but one cannot interrupt the other while it's doing something else.
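The "voicemail" style of messaging could look something like the sketch below. It assumes a browser that allows workers to be created from Blob URLs; `sumUpTo` and `startSumWorker` are hypothetical names used only for illustration.

```javascript
// The long running calculation (stand-in work for this example).
function sumUpTo(limit) {
  var total = 0;
  for (var i = 1; i <= limit; i++) total += i;
  return total;
}

// Builds a worker from inline source so the sketch fits in one file.
// Normally the worker code would live in its own .js file.
function startSumWorker(limit, onResult) {
  var source = sumUpTo.toString() +
    '\nself.onmessage = function (e) { self.postMessage(sumUpTo(e.data)); };';
  var url = URL.createObjectURL(new Blob([source], { type: 'text/javascript' }));
  var worker = new Worker(url);

  // Runs on the main thread, the next time it is free to process messages.
  worker.onmessage = function (event) {
    onResult(event.data);
    worker.terminate();
    URL.revokeObjectURL(url);
  };

  worker.postMessage(limit);   // "leave a message" for the worker
}

// Only browsers expose a global Worker constructor and a document.
if (typeof Worker === 'function' && typeof document !== 'undefined') {
  startSumWorker(1e8, function (total) {
    // The worker cannot touch the page; the main thread does it instead.
    document.title = 'Sum: ' + total;
  });
}
```

The worker never sees `document`; only the callback on the main thread updates the page.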
Related
I have a basic question related to JavaScript.
The scenario is a user scrolling down a news feed of some social media app.
This would have three different processes interleaved with each other:
on scroll event
creating ajax request to some URL
receiving responses to these requests.
There can be any relative order of these once the first request is fired.
So my question is: are there two different processes running at the same time?
one to listen to on-scroll events.
one to listen to the received response.
If so how is it synchronized on the event loop and how does the process run (given the single-threaded nature of js and possible single-core environment)?
Are these processes spawned by the browser? If so, how is synchronization maintained between the event loop, browser stack, execution stack, and these processes?
JavaScript is single threaded but the browser isn't.
By saying that JavaScript is single threaded we mean that there's only one sequence of instructions executing at a given time (on the main thread; you can now have worker threads running in parallel to the main thread, similar to the situation when you have two browser tabs open).
The JavaScript engine (in Chrome it's V8) is part of the browser and it executes JavaScript code according to the ECMAScript specification.
In the browser there are built-in APIs added to JavaScript, so it might seem that they are part of JS itself, but they're not.
DOM is a browser API. When you use any functionality related to the DOM, JavaScript gets out of its "world" and communicates with external services.
It's not the JavaScript engine that renders and changes the page. When you use the DOM API you aren't changing the document directly - you're just telling the browser renderer what changes need to be made. Even the event loop is not a part of V8 - it's a browser construct.
Events are picked up by the browser, and when there's a listener associated with a particular event, the handler is added to the event loop (which is just a queue of things to do). When there's a job in the queue it's performed by V8, and when it completes the next job is taken care of.
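As a rough mental model of that queue, here is a toy sketch. It is not how a browser is actually implemented (real event loops have several queues, timers and rendering steps), and every name in it is hypothetical.

```javascript
var queue = [];      // handlers waiting for their turn
var handled = [];    // record of what ran, in order

// The browser notices an event and queues the handler; nothing runs yet.
function browserSawEvent(name, handler) {
  queue.push(function () { handler(name); });
}

// The engine takes one job at a time and runs it to completion.
function runEventLoop() {
  while (queue.length > 0) {
    var job = queue.shift();
    job();
  }
}

browserSawEvent('scroll', function (n) { handled.push(n); });
browserSawEvent('ajax-response', function (n) {
  handled.push(n);
  // A handler can cause more work to be queued, e.g. a follow-up response.
  browserSawEvent('ajax-response-2', function (m) { handled.push(m); });
});
browserSawEvent('scroll', function (n) { handled.push(n); });

runEventLoop();
console.log(handled.join(', '));
// prints: scroll, ajax-response, scroll, ajax-response-2
```

Scroll handlers and response handlers never run at the same time; they simply take turns in whatever order the browser queued them.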
I am no beginner in JavaScript. I have actually been working with it for the past 3-4 months, but today I read this statement about "What is JavaScript?"
JavaScript is single-threaded, non-blocking, asynchronous, concurrent language.
and I was lost. If JavaScript is single-threaded, how can it be concurrent and how can it be asynchronous? You need to keep track of what your async code is doing, and without another thread it seems impossible to track two or more pieces of code at the same time.
Ah.. here's the thing:
JavaScript is single threaded, but it has a lot of spare time on its hands.
When it is waiting for something to load off the network, or it's waiting for something off disk or waiting for the OS to hand something back to it, it can run other code.
setTimeout(function() {
// Do something later.
}, 1000);
While it is waiting for that timeout to expire and execute that code, it can run code from OTHER timeouts, or network calls or any other async code in the system. It only runs ONE block of code at a time, however, which is why we say it is single threaded.
That thread can just bounce around. A lot.
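Here's a small sketch of that bouncing around. The names and delays are made up for the example; the point is only the order in which things run.

```javascript
var order = [];

order.push('start');

// Both timers are handed off; the thread does not sit and wait for them.
setTimeout(function () { order.push('timer A (20 ms)'); }, 20);
setTimeout(function () { order.push('timer B (0 ms)'); }, 0);

// This runs before either timer callback, even the 0 ms one, because a
// callback only gets its turn once the current block of code is finished.
order.push('end');

setTimeout(function () {
  console.log(order.join(' -> '));
  // prints: start -> end -> timer B (0 ms) -> timer A (20 ms)
}, 50);
```

One thread ran all of it; it just ran each block when that block's turn came up.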
And, as others have said, there are web workers and service workers, but those run VERY isolated from your main thread. They can't change values behind your main thread's back.
Updated per comment
The event loop works by:
Waiting for an event
Handling that event.
JavaScript is, indeed, blocked while handling an event. While code is running, nothing else in that page (assuming browser main thread) can run.
It isn't a literal event loop as you would have in C or C++, not as far as the JS is concerned. It's just events waiting to happen.
/// Sample code
document.addEventListener("click", function() { /* Handle click */ });
window.addEventListener("load", function() { /* handle load */ });
In this case, we have two event listeners in our code. The JS engine will compile, then execute those two statements. Then, for all intents, "sleep" while waiting for something to happen. In reality, that same thread may handle various house-keeping tasks like drawing the HTML page, listening for mouse movements and emitting all sorts of events, but that doesn't matter for this discussion.
Then, once the rest of the page is loaded, the browser will emit a load event, which will be caught by the listener and some more code will be run.
Then it will go back to idling until someone clicks on the document, then more code will run.
If we change the code to this:
document.addEventListener("click", function() {
while(true);
});
then when someone clicks on the document, our thread will go into an endless loop and all browser activity in that window will cease. It might even freeze the entire browser, depending on which one you are running.
Eventually, the browser will give you a chance to kill that task so you can have your system back.
Latest Update
If you are aware of WebAssembly, there is a proposal in place for pthreads-style threads via natively compiled modules; read this git issue tracker link (1073).
In continuation of @Jeremy J Starcher's answer.
JavaScript has always been a single-threaded runtime using asynchronous, non-blocking and event-driven models of execution.
To learn more about event loop execution in JS, I highly recommend watching this YouTube video: a simply superb explanation by Philip Roberts.
In the good old days, developers would resort to workarounds to achieve something similar to a thread model, using:
setTimeout with 0 milliseconds, or setInterval : Basically instructing the engine to take up non-trivial tasks when it goes idle or is waiting during an HTTP request, or to execute code by switching back and forth at intervals in a kind of round-robin fashion.
Hidden iframe : Run JS code in a sandbox with a bridge to communicate from parent to iframe and vice versa. Technically an iframe doesn't run on a separate thread, but it gets things done as a fake thread.
Fast forwarding [ >>> ] to multi-threading models on the web platform:
Of late, things have changed: there is a need to spawn a thread to offload a few smaller logical tasks or a network proxy task, so that the main thread can concentrate on UI-driven tasks like the presentation and interaction layer, which makes sense.
With that requirement in mind, the web standards bodies (W3C and WHATWG) came up with two models/APIs to solve this.
1. Web Worker: (quoted from Mozilla)
Web Workers makes it possible to run a script operation in a background
thread separate from the main execution thread of a web application.
The advantage of this is that laborious processing can be performed in
a separate thread, allowing the main (usually the UI) thread to run
without being blocked/slowed down.
[ WebWorker can be split into two ]
Shared Worker
The SharedWorker interface represents a specific kind of worker that
can be accessed from several browsing contexts, such as several
windows, iframes or even workers. They implement an interface
different than dedicated workers and have a different global scope,
SharedWorkerGlobalScope.
Dedicated Worker : Same as Webworker, created using Worker() API but uses DedicatedWorkerGlobalScope
Worker is an object created using a constructor (e.g. Worker()) that
runs a named JavaScript file — this file contains the code that will
run in the worker thread; workers run in another global context that
is different from the current window. This context is represented by a
DedicatedWorkerGlobalScope object in the case of dedicated workers
2. Service Worker (quoted from Mozilla)
Service workers essentially act as proxy servers that sit between web
applications, and the browser and network (when available). They are
intended to (amongst other things) enable the creation of effective
offline experiences, intercepting network requests and taking
appropriate action based on whether the network is available and
updated assets reside on the server. They will also allow access to
push notifications and background sync APIs.
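As a rough idea of the "proxy" role, a service worker can answer a request from the network and fall back to a cached copy when the network is unavailable. The sketch below is one possible strategy, not the only one; `networkFirst` and its injected `fetchFn` and `cache` parameters are hypothetical names chosen so the logic can be read outside a service worker.

```javascript
// Try the network first; if that fails (e.g. offline), fall back to the
// cache. If nothing is cached either, the result is undefined and the
// request fails as it normally would.
function networkFirst(request, fetchFn, cache) {
  return fetchFn(request).catch(function () {
    return cache.match(request);
  });
}

// ServiceWorkerGlobalScope only exists inside a service worker, so this
// part is skipped everywhere else.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', function (event) {
    event.respondWith(
      networkFirst(
        event.request,
        function (req) { return fetch(req); },
        caches   // the browser's Cache Storage, which has a match() method
      )
    );
  });
}
```

This assumes something has already put responses into the cache (typically during the service worker's install step), which is not shown here.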
One example usage would be in a PWA (Progressive Web App), to download scripts or lazy-load assets.
Read this article by Eric Bidelman on HTML5Rocks for a good explanation of the code itself and its implementation.
JavaScript may be "single-threaded" (I'm not sure this is really the case), but you can use/create webworkers to run javascript outside the main thread.
So you can run two pieces of code at the same time in parallel.
I think it is wrong to say that a language is this or that when what we really mean is that our programs are this or that.
For example: NodeJS is single-threaded and can run code asynchronously because it uses event-driven behaviour. (Something comes up and fires an event... Node deals with it and if it is something like an online request, it does other things instead of waiting for the response... when the response comes, it fires an event and Node captures it and does whatever needs to be done).
So Javascript is...
single-threaded? No, as you can use WebWorkers as a second thread
non-blocking? You can write code that blocks the main thread. Just build a for loop that executes a hundred million times or don't use callbacks.
asynchronous? No, unless you use callbacks.
concurrent? Yes, if you use webworkers, callbacks or promises (which are really callbacks).
JavaScript is single threaded - Silverlight is not, but interaction between JavaScript and Silverlight must be performed on the Silverlight UI thread.
However, what exactly is the relationship between the Silverlight UI thread and the JavaScript thread? Are they by any definition the same thread, or separate threads with the interactions performed purely through the respective event loops and blocking one thread when waiting for the other (when evaluating/calling JavaScript from Silverlight for example)? Put another way, can JavaScript execute concurrently with Silverlight actions on the UI thread (and can multiple Silverlight instances hosted in the same page have their UI threads running concurrently)?
I haven't used Silverlight, but I've done pretty extensive work with Java Applets and Flash, so I'll comment from that perspective.
You're right that JavaScript is single-threaded. Anything that causes it to block will prevent all other computation and actions. It will even lock the browser in some cases, though newer browsers are getting better at separating out tabs into separate processes, which helps.
Any thread in a plugin like Silverlight is completely separate from JavaScript in the browser. The interfaces between them may be blocking however. If Silverlight's UI thread blocks when communicating with native JS, then no other work will be done on that thread while it's waiting. Other threads can continue to work as normal.
To address your question about whether JS can execute concurrently while actions on the Silverlight UI thread are running, I don't see why not. They have separate runtimes, and as long as they're not intercommunicating (which would cause one to block), they should be able to keep running fine in isolation.
My gut says the same would be true of multiple Silverlight instances in the same page, but that's really an architectural design question that I'm not able to answer.
Hope this helps!
I have recently heard about the Web Workers spec that defines an API for multi-threading JavaScript. But after working with client side scripting for so long now (and the event-driven paradigm), I don't really see the point of using multiple threads.
I can see how the JavaScript engine and browser rendering engine can benefit from multi-threading, but I really don't see much benefit in handing this power to application programmers.
The Wikipedia article actually answers your question fairly well.
The power is given to us developers so that we can specifically offload tasks that would be disruptive to users to a web worker. The browser does not know which scripts are necessary for your custom interface to function properly, but you do.
If you've got a script that blocks the page rendering for 10 seconds but isn't necessary for the website to function, you could offload it to a web worker. Doing so allows your users to interact with the page instead of forcing them to wait 10 seconds for that script to execute. In a way, it's like AJAX in that things can be injected in after the interface loads so as to not delay users' interaction.
A Node.js server works on an event-based model where callback functions are supported. But I am not able to understand how it is better than traditional thread-based servers where threads wait for system IO. In the thread-based model, when a thread needs to wait for IO, it gets preempted, so it doesn't consume CPU cycles and hence doesn't contribute to wait time.
How does Node.js improve wait time?
when a thread needs to wait for IO, it gets preempted
Actually, it's not preempted. Preemption is something completely different. What happens is that the thread is blocked.
For an event based model something similar happens. Event based interpreters are basically state machines. Only, the state machine is abstracted away and is not visible to the user. When something is waiting for an event it passes the control back to the interpreter. When the interpreter has nothing else to process it blocks itself waiting for I/O. Only, unlike traditional threading code the interpreter waits for multiple I/O.
What's happening at the C level is that the interpreter is using something like select(), poll(), epoll() and friends (depends on the OS and library installed) to do the blocking and waiting for I/O.
Now, why does a select()/poll() based mechanism generally perform better? Actually, 'generally' here depends on what you mean. A select() based server executes all code in a single process/thread. The biggest performance gain from this is that it avoids context switching - every time the OS transfers control over from one thread to another it has to save all the relevant registers, memory map, stack pointers, FPU context etc. so that the other thread can resume execution where it left off. The overhead of doing this can be quite significant.
In fact, there is a historical example of how extreme the overhead can be. Back in the early 2000s someone started benchmarking web servers. To the surprise of everyone, tclhttpd outperformed Apache for serving static files. Now, tcl is not only an interpreted language, but back in 2000 it was a very slow interpreted language because it didn't have a separate compilation phase (it sort of does now). Tcl scripts are interpreted directly in string form making it around 400x slower than C. Apache is obviously written in C so what's making tclhttpd faster?
It turned out that tclhttpd is event based running only on a single thread while Apache was multithreaded. The overhead of constant thread switching turned out to give tclhttpd enough advantage to perform better than Apache.
Of course, there is always a compromise. A single threaded server like tclhttpd or node.js cannot take advantage of multiple CPUs. Back in the early 2000s multiple CPUs were uncommon. These days they are almost default. Not to mention that most CPUs are also hyperthreaded (hyperthreading adds hardware to the CPU to make context switching cheap).
The best servers these days have learned from history and are a combination of both. Apache2 and Nginx use thread pools: they are multithreaded but each thread serves more than a single connection. This is a hybrid of the two approaches but is more complex to manage.
Read the following article for a more in-depth discussion on this topic: The C10K problem
Threads are relatively heavy-weight objects that have a resource footprint extending all the way into the kernel. When you park a thread in a blocking syscall or on a mutex or condition variable, you are tying up all those resources but doing nothing. Now the OS has to find more resources so your program can create another thread... Then you idle them too. It doesn't take long before the OS is struggling to scavenge more resources for your program to waste.
CPU time is just one small part of the bigger picture. :-)
Simply put:
In a threaded server, no matter how many threads you have, you can always have that many threads waiting for IO.
In node, no matter how many IO operations are pending, you always have your event loop ready to do the next thing.
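A tiny sketch of the second point, using timers to stand in for real I/O (sockets, files, and so on). `fakeIo` and the delays are made up for this illustration.

```javascript
var pending = 0;
var finished = [];

// Start an "I/O operation" and return immediately; the callback runs
// later, when the event loop is told the operation has completed.
function fakeIo(name, delayMs) {
  pending++;
  setTimeout(function () {
    pending--;
    finished.push(name);
  }, delayMs);
}

fakeIo('db query', 30);
fakeIo('file read', 10);
fakeIo('http call', 20);

// All three operations are now waiting at once, yet nothing is blocked:
// this line runs immediately on the same single thread.
var pendingAfterStart = pending;
console.log('pending right after starting:', pendingAfterStart);
// prints: pending right after starting: 3
```

In a one-thread-per-request server the same situation would mean three parked threads; here it is three entries the event loop is keeping track of.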
When you have a lot of threads you are going to have a lot of context switching, which is going to be expensive. You won't have this overhead when using Node.js's event loop.
Context Switch
A context switch is the
computing process of storing and
restoring state (context) of a CPU so
that execution can be resumed from the
same point at a later time.
Event loop
In computer science, the event loop,
message dispatcher, message loop or
message pump is a programming
construct that waits for and
dispatches events or messages in a
program.
I think you believe a lot of myths regarding threads and the cost of context switching.
Discover the truth for yourself.