Is there any way for me to share a variable between two web workers? (Web workers are basically threads in Javascript)
In languages like c# you have:
public static string message = "";
static void Main()
{
message = "asdf";
new Thread(mythread).Run();
}
public static void mythread()
{
Console.WriteLine(message); //outputs "asdf"
}
I know thats a bad example, but in my Javascript application, I have a thread doing heavy computations that can be spread across multiple threads [since I have a big chunk of data in the form of an array. All the elements of the array are independent of each other. In other words, my worker threads don't have to care about locking or anything like that]
I've found the only way to "share" a variable between two threads would be to create a Getter/setter [via prototyping] and then use postMessage/onmessage... although this seems really inefficient [especially with objects, which I have to use JSON for AFAIK]
LocalStorage/Database has been taken out of the HTML5 specification because it could result in deadlocks, so that isn't an option [sadly]...
The other possibility I have found was to use PHP to actually have a getVariable.php and setVariable.php pages, which use localstorage to store ints/strings... once again, Objects [which includes arrays/null] have to be converted to JSON... and then later, JSON.parse()'d.
As far as I know, Javascript worker threads are totally isolated from the main page thread [which is why Javascript worker threads can't access DOM elements
Although postMessage works, it is slow.
Web workers are deliberately shared-nothing -- everything in a worker is completely hidden from other workers and from pages in the browser. If there were any way to share non-"atomic" values between workers, the semantics of those values would be nearly impossible to use with predictable results. Now, one could introduce locks as a way to use such values, to a certain extent -- you acquire the lock, examine and maybe modify the value, then release the lock -- but locks are very tricky to use, and since the usual failure mode is deadlock you would be able to "brick" the browser pretty easily. That's no good for developers or users (especially when you consider that the web environment is so amenable to experimentation by non-programmers who've never even heard of threads, locks, or message-passing), so the alternative is no state shared between workers or pages in the browser. You can pass messages (which one can think of as being serialized "over the wire" to the worker, which then creates its own copy of the original value based on the serialized information) without having to address any of these problems.
Really, message-passing is the right way to support parallelism without letting the concurrency problems get completely out of control. Orchestrate your message handoffs properly and you should have every bit as much power as if you could share state. You really don't want the alternative you think you want.
There are two options to share data between dedicated workers:
1. Shared Workers
The SharedWorker interface represents a specific kind of worker that
can be accessed from several browsing contexts, such as several
windows, iframes or even workers.
Spawning a Shared Worker in a Dedicated Worker
2. Channel Messaging API
The Channel Messaging API allows two separate scripts running in
different browsing contexts attached to the same document (e.g., two
IFrames, or the main document and an IFrame, two documents via a
SharedWorker, or two workers) to communicate directly, passing
messages between one another through two-way channels (or pipes) with
a port at each end.
How to call shared worker from the web worker?
No, but you can send messages to web workers which can be arrays, objects, numbers, strings, booleans, and ImageData or any combination of these. Web workers can send messages back too.
I recently read about (but have not used), shared workers. According to Share the work! Opera comes with SharedWorker support, support is only in the newest browsers (Opera 10.6, Chrome 5, Safari 5).
Related
I have hundreds of Uint8 typed arrays that I store in a list.
I would like to start implementing Web Workers in my code, I need these typed arrays to be available to all my workers at the same time. So I need to create SharedArrayBuffer.
I had in mind to send a list of SharedArrayBuffer instead of a single SharedArrayBuffer that gathers all my data and converts them to a typed array only once at initialization in my workers. This method would simplify the code a lot.
but in all the examples I could find, it was always just a single SharedArrayBuffer that was sent to the worker with postMessage.
Would this be a bad way of using it? Will it cause me any performance issues or anything else?
I'm working on an application where I have multiple webworkers running together. These webworkers are developed by third parties, and are not trusted. They provide postmessage APIs to each other.
I would like to enable the webworkers to have safe access to local storage. IndexedDB is the standard choice, however I need to ensure that a malicious webworker cannot interfere with the data of another webworker.
My original idea was that I could 'domain' each webworker somehow. Each one gets access to its own piece of IndexedDB, and cannot see the storage put in other pieces by other webworkers. At the moment, I do not believe this is possible since I need the workers to exist together in one iframe.
My next idea was to have a single, trusted webworker that has IndexedDB access, and set up sandbox rules for all of the other webworkers such that they can't use IndexedDB at all, but instead must communicate with the API of the trusted webworker to store and retrieve local data. My current understanding is that I can get this to work if I use two iframes, where the first iframe has access to IndexedDB and runs the trusted webworker, and the second iframe is in a different domain where non-malicious webworkers know not to use the storage.
I am not a huge fan of the two iframe solution - it's complex, has performance overheads, and requires webworker devs to know they can't safely use localstorage even though they actually have access - and I'm looking for a better way to sandbox specific webworkers away from indexeddb.
According to this W3C Working Draft, the ScriptProcessorNode is deprecated and will be replaced by AudioWorkerNode.
And Chrome recently has AudioWorklet implemented to replace ScriptProcessorNode.
Are those two APIs the same thing? Does Chrome just implement it with a different name?
Yes it has just been renamed AudioWorklet, even in the specs probably to mark the distinction between this API and Web Worker, since this API is more like the Worklets API proposed by CSS-WG.
They key points Worklets do differ to Workers are (according to houdini page)
Worklets are similar to web workers however they:
Are thread-agnostic. That is, they are not defined to run on a particular thread. Rendering engines may run them wherever they choose.
Are able to have multiple duplicate instances of the global scope created for the purpose of parallelism.
Are not event API based. Instead classes are registered on the global scope, whose methods are to be invoked by the user agent.
Have a reduced API surface on the global scope.
Have a lifetime for the global scope which is defined by subsequent specifications or user agents. They aren’t tied to the lifetime of the document.
As worklets have a relatively high overhead, they should be used sparingly. Due to this worklets are expected to be shared between separate scripts.
Is web worker just a normal native thread created by browser to run and communicate with other browser threads with a messaging queue? Or doesn't it contain other things when created?
I'm looking at the experimental support of pthread in emscripten, multiple threads in C++ will be translated to web workers once compiled. But will it have the same level of performance as native code? After all fine grained multithreading is a key feature in C++.
At the moment, WebWorkers are pretty heavyweight because the VM duplicates a bunch of its internal state (sometimes even re-JITs code for a worker). That state is much bigger than a native thread's few MiBs of initial stack space and associated state.
Some of this can be fixed by implementations, and I expect that if SharedArrayBuffer or WebAssembly + threads become popular then browser engines will want to optimize things.
That being said, the end of your question hints at a misunderstanding of what thread overheads are, and how the proposal for SharedArrayBuffer (which Emscripten relies on to support pthreads) works. WebWorkers are heavy at the moment, but they can communicate through SABs in exactly the same way native code such as C++ can: by accessing exactly the same memory, at the same virtual address. SAB adds a new kind of ArrayBuffer to JavaScript which doesn't get neutered when you postMessage it to another worker. Multiple workers can then see other worker's updates to the buffer in exactly the same way C++ code when you use std::atomic.
At the same time, workers can't block the main thread by definition and therefore have a more "native" feel. Some web APIs aren't available to all workers, but that's been changing. This becomes relevant if you e.g. write a game and have network / audio / render / AI / input in different threads. The web is slowly finding its own way of doing these things.
The details are a bit trickier
SAB currently only supports non-atomic accesses, and sequentially-consistent accesses (i.e. the only available Atomic access at the moment is the same as C++'s std::memory_order_seq_cst). Doing non-atomic accesses should be about as performant as C++'s non-atomics (big compiler caveats here which I won't get into), and using Atomic should be about as performant as C++'s std::atomic default (which is std::memory_order_seq_cst). C++ has 5 other memory orders (relaxed, consume, acquire, release, acq_rel) which SAB doesn't support at the moment. These other memory orders allow native code to be faster in some circumstances, but are harder to use correctly and portably. They might be added to future updates to SAB, e.g. through this issue.
SAB also supports Futex which native programs rely on under the hood to implement efficient mutex.
There are even trickier details when comparing to C++, I've detailed some of them but there are even more.
This question already has answers here:
Is there a way to create out of DOM elements in Web Worker?
(10 answers)
Closed 7 years ago.
We all know we can spin up some web workers, give them some mundane tasks to do and get a response at some point, at which stage we normally parse the data and render it somehow to the user right.
Well... Can anyone please provide a relatively in-depth explanation as to why Web Workers do not have access to the DOM? Given a well thought out OO design approach, I just don't get why they shouldn't?
EDITED:
I know this is not possible and it won't have a practical answer, but I feel I needed a more in depth explanation. Feel free to close the question if you feel it's irrelevant to the community.
The other answers are correct; JS was originally designed to be single-threaded as a simplification, and adding multi-threading at this point has to be done very carefully. I wanted to elaborate a little more on thread safety in Web Workers.
When you send messages to workers using postMessage, one of three things happen,
The browser serializes the message into a JSON string, and de-serializes on the other end (in the worker thread). This creates a copy of the object.
The browser constructs a structured clone of the message, and passes it to the worker thread. This allows messaging using more complex data types than 1., but is essentially the same thing.
The browser transfers ownership of the message (or parts of it). This applies only for certain data types. It is zero-copy, but once the main context transfers the message to the worker context, it becomes unavailable in the main context. This is, again, to avoid sharing memory.
When it comes to DOM nodes,
is obviously not an option, since DOM nodes contain circular references,
could work, but in a read-only mode, since any changes made by the worker to the clone of the node will not be reflected in the UI, and
is unfortunately not an option since the DOM is a tree, and transferring ownership of a node would imply transferring ownership of the entire tree. This poses a problem if the UI thread needs to consult the DOM to paint the screen and it doesn't have ownership.
The entire world of Javascript in a browser was originally designed with a huge simplification - single threadedness (before there were webWorkers). This single assumption makes coding a browser a ton simpler because there can never be thread contention and can never be two threads of activity trying to modify the same objects or properties at the same time. This makes writing Javascript in the browser and implementing Javascript in the browser a ton simpler.
Then, along came a need for an ability to do some longer running "background" type things that wouldn't block the main thread. Well, the ONLY practical way to put that into an existing browser with hundreds of millions of pages of Javascript already out there on the web was to make sure that this "new" thread of Javascript could never mess up the existing main thread.
So, it was implemented in a way that the ONLY way it could communicate with the main thread or any of the objects that the main thread could modify (such as the entire DOM and nearly all host objects) was via messaging which, by its implementation is naturally synchronized and safe from thread contention.
Here's a related answer: Why doesn't JavaScript get its own thread in common browsers?
Web Workers do not have DOM access because DOM code is not thread-safe. By DOM code, I mean the code in the browser that handles your DOM calls.
Making thread-unsafe code safe is a huge project, and none of the major browsers are doing it.
Servo project is writing new browser from scratch, and I think they are writing their DOM code with thread-safety in mind.