Prevent webworkers from using IndexedDB


I'm working on an application where I have multiple webworkers running together. These webworkers are developed by third parties and are not trusted. They provide postMessage APIs to each other.
I would like to enable the webworkers to have safe access to local storage. IndexedDB is the standard choice, however I need to ensure that a malicious webworker cannot interfere with the data of another webworker.
My original idea was that I could 'domain' each webworker somehow. Each one gets access to its own piece of IndexedDB, and cannot see the storage put in other pieces by other webworkers. At the moment, I do not believe this is possible since I need the workers to exist together in one iframe.
My next idea was to have a single, trusted webworker that has IndexedDB access, and set up sandbox rules for all of the other webworkers such that they can't use IndexedDB at all, but instead must communicate with the API of the trusted webworker to store and retrieve local data. My current understanding is that I can get this to work if I use two iframes, where the first iframe has access to IndexedDB and runs the trusted webworker, and the second iframe is in a different domain where non-malicious webworkers know not to use the storage.
I am not a huge fan of the two iframe solution - it's complex, has performance overheads, and requires webworker devs to know they can't safely use localstorage even though they actually have access - and I'm looking for a better way to sandbox specific webworkers away from indexeddb.
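To make the trusted-worker idea concrete, here is a minimal sketch of the storage API such a worker might expose. Everything in it is an assumption for illustration: the name createBroker, the message shape {op, key, value}, and the in-memory Map backend (a real version would keep IndexedDB behind the same interface). It only shows how requests could be namespaced per worker; it does not by itself stop an untrusted worker from calling indexedDB directly, which still needs a separate origin or sandbox.

```javascript
// Hypothetical storage broker: each worker ID gets its own private store,
// so one worker can never address another worker's records.
function createBroker() {
  const stores = new Map(); // workerId -> Map holding that worker's records

  const storeFor = (workerId) => {
    if (!stores.has(workerId)) stores.set(workerId, new Map());
    return stores.get(workerId);
  };

  // workerId must be assigned by trusted code (for example, looked up from
  // the Worker object that sent the message), never read from the message.
  return function handle(workerId, message) {
    const { op, key, value } = message || {};
    if (typeof key !== 'string') {
      return { ok: false, error: 'key must be a string' };
    }
    const store = storeFor(workerId);
    switch (op) {
      case 'put':
        store.set(key, value);
        return { ok: true };
      case 'get':
        return { ok: true, value: store.get(key) };
      case 'delete':
        store.delete(key);
        return { ok: true };
      default:
        return { ok: false, error: `unknown op: ${op}` };
    }
  };
}
```

The trusted side would call handle with an ID it chose itself for each worker and post the returned object back to that worker.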

Related

Memory mapped equivalent for FirefoxOS

How would you emulate a memory mapped file in FirefoxOS, Tizen or any other mobile pure-JS solution?
The use case is for a mobile browser and you need lots of data which does not fit in the RAM or you don't want to waste RAM for it yet and prefer to lazy load it.
The only thing I found is IndexedDB or what can I do about it? Any better tricks or APIs?
Hmmh it looks like Web SQL Database could be also a solution on Android, Tizen or iOS. But Firefox does not support it (?)
Update: I'm asking because of some experiments

First things first: Web SQL will never be standardised, as explained in the specification, so it should be considered only for WebKit/Blink-based browsers.
There is an awesome overview of offline storage options in this question. Even though that question is about map tiles, I think it is still relevant to your use case.
I believe you are on the right track with IndexedDB for the graph data. On a high level it is a key-value asynchronous object store (see the Basic Concepts document). For your use case, you could index graph nodes in an object store. There is, for example, LevelGraph library which stores graph data in IndexedDB, though it is built for Semantic Web triples. HeliosJS is also worth mentioning, though it is an in-memory graph database.
Edit: The current API for IndexedDB is asynchronous. A synchronous API is drafted in the spec, which could be used only in web workers. Unfortunately, no engine currently implements it. There is a pending patch for Gecko, but I did not find any plans for Blink or WebKit, so it is not a meaningful option now.
It is possible to access raw files through Web APIs. You could use XHR2 to load a (local) file as a binary Blob. Unfortunately XHR2 is mostly designed for streaming files and not for a random access, though you could split-up the data into multiple files and request them on demand, but that may be slow.
Direct access to files is currently quite limited: FileList and createObjectURL are primarily used for direct file user input (through Drag and Drop or a file input field), the FileSystem API was recently killed, and DeviceStorage is non-standard and privileged (Firefox OS-specific). You can also store files in IndexedDB, which is described for the FileHandle API. However, once you manage to get access to the raw File object, you can use the Blob.slice method to load chunks of the file; there is a great example of reading file chunks via an upload form.
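As a rough illustration of the Blob.slice approach, the sketch below walks a Blob or File in fixed-size pieces. The helper names blobChunks and readChunk are made up for this example, and readChunk assumes a runtime that provides Blob.arrayBuffer() (older browsers would need FileReader instead).

```javascript
// Yield consecutive slices of a Blob/File. Blob.slice(start, end) returns a
// new Blob covering that byte range; nothing is read until the slice is read.
function* blobChunks(blob, chunkSize) {
  if (!(chunkSize > 0)) throw new RangeError('chunkSize must be positive');
  for (let start = 0; start < blob.size; start += chunkSize) {
    yield blob.slice(start, Math.min(start + chunkSize, blob.size));
  }
}

// Read a single chunk on demand, by index. Returns a Promise<ArrayBuffer>.
function readChunk(blob, index, chunkSize) {
  const start = index * chunkSize;
  return blob.slice(start, start + chunkSize).arrayBuffer();
}
```

For random access, readChunk(file, n, size) fetches only the n-th piece, which is the closest this gets to a memory-mapped read.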
You may also want to look at jDataView library & friends, which eases handling of binary data through the more efficient ArrayBuffer.
Edit: As for the synchronous API, localStorage (aka DOM Storage) could be considered too. It is also a key-value storage, but much much simpler and more limited than IndexedDB:
Size of the storage is limited, usually to 5 MB
Only one localStorage per domain/application (you can have multiple named object stores in IndexedDB).
Only strings can be stored.
In general, localStorage is useful cookies replacement, but it is not really useful for storing large offline data.
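Because localStorage holds only strings, structured values have to be serialized on the way in and parsed on the way out. A minimal sketch, assuming a made-up helper name jsonStore and values that are JSON-serializable:

```javascript
// Wrap any Storage-like object (getItem/setItem with string values) so that
// callers can store JSON-serializable values instead of raw strings.
function jsonStore(storage) {
  return {
    set(key, value) {
      storage.setItem(key, JSON.stringify(value));
    },
    get(key, fallback = null) {
      const raw = storage.getItem(key); // null when the key is missing
      return raw === null ? fallback : JSON.parse(raw);
    },
  };
}
```

In a page this would be used as jsonStore(window.localStorage); the size limit mentioned above still applies to the serialized strings.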
So to sum it up:
IndexedDB is the easiest and most widely available option, though it may be slow, inefficient or hit memory limits with very large data; also, only the asynchronous API is currently available.
Raw file access is hard to obtain without user interaction and the APIs are unstable and non-standard.
In the end, you can combine both approaches; two options come to mind:
Use XHR2 to parse the large file in chunks and store the parsed nodes into IndexedDB
Store the large file into IndexedDB (via XHR), use FileHandle.getFile to load the File object and Blob.slice to read its content.
In all cases, you can (should) use Web Workers to handle data manipulation and calculations in the background.
Anyway, GraphHopper looks great, we really lack such non-trivial offline applications for Firefox OS, so good luck!

Looking for examples of accessing IndexedDB from several scripts

I'm trying to implement a job processor using workers in the background.
I will store some job-related information in IndexedDB.
I tried to find information about accessing the same IndexedDB database from multiple scripts (multiple workers, in my case), including how version changes work in that situation, but couldn't find anything useful.
I need some information on that topic...

You can look at my IDB library as an example of IDB in Web Workers.
Things to note:
IDB in a Web Worker does not work in Firefox. Although the spec says it should allow async access (not to mention non-existing sync access), and the Mozilla IDB dev is championing their ticket for this, Mozilla's bug tracker suggests to me lots of issues need to be worked out and that this won't be available in the near future (as of 3/2014)
Like IDB when it stores data, Web Workers use a structured clone algorithm to pass data between the worker thread and its parent. This means that all your objects need to be cloneable, so you need to transform IDBObjectStore, DOMStringList, etc. into plain vanilla JS objects.
Otherwise, IDB in Web Workers is great. Personally I think this is the best way to fetch data without any chance of locking the UI.
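The structured clone point above can be sketched as follows. The helper name describeDatabase is hypothetical and not taken from the library mentioned; it simply copies the cloneable facts about an IDBDatabase into a plain object before it is handed to postMessage.

```javascript
// Build a plain, cloneable summary of an IDBDatabase.
// db.objectStoreNames is a DOMStringList (array-like), so Array.from turns
// it into an ordinary array of strings that postMessage can copy.
function describeDatabase(db) {
  return {
    name: db.name,
    version: db.version,
    objectStoreNames: Array.from(db.objectStoreNames),
  };
}
```

Inside a worker, the result could then be sent with self.postMessage(describeDatabase(db)) instead of trying to post the database object itself.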

Client Side Storage/Caching Large HTML

I have a web page that references and initializes multiple instances of the same ASP.NET generic user control.
What I want to do, is to cache/store the entire contents (html) of those controls somewhere on the client using the jquery detach() method.
The solution of localStorage does not fit here as it has a limit to 5MB which is low for my needs.
For now, I am using a global variable, more specifically a JavaScript array (key-value), to store the data.
What do you think of this solution? Will I notice any lags or performance issues on browser? Is there any limit for that global var?
Also, is there a better way to implement such a task?

For cross browser compatibility, you can try an AJAX call that pulls/increments in your massive data and cache the call (stored as JSON/JSONP). jQuery has a cache mechanism but the meat of the implementation is going to be on the headers of the page call. Specifically you're going to want to add Expires, Last-Modified, and Cache-Control on the pages you AJAX in.
Then you'll want to pull in the data asynchronously and do the appropriate UI manipulation (if needed).
You don't want to store massive data in a single variable, since it's going to take longer when it goes through the JS process.
localStorage is still an edge technology, is implemented differently across vendors, and isn't backwards compatible (although there are JavaScript libs that help mitigate backwards compatibility)
Cookies not big enough
On-Page JSON or JS Variable You lose abstraction and increase initial page weight (which is going to be unsatisfactory if you're on mobile)
Whatever implementation you do, I would run some simple benchmark performance tests so you have the metric to backup your code

This will cause browser lag and multiple issues. You can pretty much guarantee that a mobile browser isn't going to work in this scenario because no sensible mobile browser is going to let you download and store 5MB+ in the LocalStorage object. It is a really bad idea to put 5MB+ of HTML into the DOM of any browser and not expect any kind of performance issue.
If you're not concerned about mobile, then look at IndexedDB. It allows a greater amount of storage and it persists even after the session is closed. It is fairly well supported in recent Chrome and Firefox browsers, but requires IE10 or higher.

Is there any way to automatically synchronize html5 localstorage between computers

I have a simple offline html5/javascript single-html-file web application that I store in my dropbox. It's a sort of time tracking tool I wrote, and it saves the application data to local storage. Since its for my own use, I like the convenience of an offline app.
But I have several computers, and I've been trying to come up with any sort of hacky way to synchronize this app's data (which is currently using local storage) between my various machines.
It seems that chrome allows synchronization of data, but only for chrome extensions. I also thought I could perhaps have the web page automatically save/load its data from a file in a dropbox folder, but there doesn't appear to be a way to automatically sync with a specific file without user prompting.
I suppose the "obvious" solution is to put the page on a server and store the data in a database. But suppose I don't want a solution which requires me to maintain apps on a server - is there another way, however hacky, to cobble together synchronization?
I even looked for a while to see if there was a vendor offering a web database service - where I could, say, post/get a blob of json on demand, and then somehow have my offline app sync with this service, but the same-origin policy seems to invalidate that plan (and besides I couldn't find such a service).
Is there a tricky/sneaky solution to this problem using chrome, or google drive, or dropbox, or some other tool I'm not aware of? Or am I stuck setting up my own server?

I have been working on a Project that basically gives you versioned localStorage with support for conflict resolution if the same resource ends up being edited by two different clients. At this point there are no drivers for server or client (they are async in-memory at the moment for testing purposes) but there is a lot of code and abstraction to make writing your own drivers really easy... I was even thinking of doing a dropbox/google docs driver myself, except I want DynamoDB/MongoDB and Lawnchair done first.
The code is not dependent on jQuery or any other libraries, and there's a fairly full-featured (though ugly) demo for it as well.
Anyway the URL is https://github.com/forbesmyester/SyncIt

Apparently, I have exactly the same issue and investigated it thoroughly. The best choice would be remoteStorage, if you can manage to make it work. It lets you use a third-party server for data storage or run your own instance.

Sharing variables between web workers? [global variables?]

Is there any way for me to share a variable between two web workers? (Web workers are basically threads in Javascript)
In languages like c# you have:
using System;
using System.Threading;

class Program
{
    public static string message = "";
    static void Main()
    {
        message = "asdf";
        // Thread has no Run() method; Start() begins execution.
        new Thread(mythread).Start();
    }
    public static void mythread()
    {
        Console.WriteLine(message); // outputs "asdf"
    }
}
I know that's a bad example, but in my Javascript application, I have a thread doing heavy computations that can be spread across multiple threads [since I have a big chunk of data in the form of an array. All the elements of the array are independent of each other. In other words, my worker threads don't have to care about locking or anything like that]
I've found the only way to "share" a variable between two threads would be to create a Getter/setter [via prototyping] and then use postMessage/onmessage... although this seems really inefficient [especially with objects, which I have to use JSON for AFAIK]
LocalStorage/Database has been taken out of the HTML5 specification because it could result in deadlocks, so that isn't an option [sadly]...
The other possibility I have found was to use PHP to actually have a getVariable.php and setVariable.php pages, which use localstorage to store ints/strings... once again, Objects [which includes arrays/null] have to be converted to JSON... and then later, JSON.parse()'d.
As far as I know, Javascript worker threads are totally isolated from the main page thread [which is why Javascript worker threads can't access DOM elements].
Although postMessage works, it is slow.

Web workers are deliberately shared-nothing -- everything in a worker is completely hidden from other workers and from pages in the browser. If there were any way to share non-"atomic" values between workers, the semantics of those values would be nearly impossible to use with predictable results. Now, one could introduce locks as a way to use such values, to a certain extent -- you acquire the lock, examine and maybe modify the value, then release the lock -- but locks are very tricky to use, and since the usual failure mode is deadlock you would be able to "brick" the browser pretty easily. That's no good for developers or users (especially when you consider that the web environment is so amenable to experimentation by non-programmers who've never even heard of threads, locks, or message-passing), so the alternative is no state shared between workers or pages in the browser. You can pass messages (which one can think of as being serialized "over the wire" to the worker, which then creates its own copy of the original value based on the serialized information) without having to address any of these problems.
Really, message-passing is the right way to support parallelism without letting the concurrency problems get completely out of control. Orchestrate your message handoffs properly and you should have every bit as much power as if you could share state. You really don't want the alternative you think you want.
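For the case in the question (a large array whose elements are independent), orchestrating the handoff can be as simple as giving each worker its own slice. A minimal sketch; the helper name partition and the worker script name crunch.js in the comment are assumptions for illustration:

```javascript
// Split `items` into at most `parts` contiguous chunks of near-equal size.
function partition(items, parts) {
  if (!(parts >= 1)) throw new RangeError('parts must be at least 1');
  const size = Math.ceil(items.length / parts);
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Browser-side usage (hypothetical worker script and collect() callback):
//   partition(data, 4).forEach((chunk) => {
//     const worker = new Worker('crunch.js');
//     worker.onmessage = (e) => collect(e.data);
//     worker.postMessage(chunk); // the worker gets its own copy of the chunk
//   });
```

If the data lives in typed arrays, the underlying ArrayBuffer can be listed in the second argument of postMessage so that it is transferred rather than copied, which avoids most of the copying cost.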

There are two options to share data between dedicated workers:
1. Shared Workers
The SharedWorker interface represents a specific kind of worker that
can be accessed from several browsing contexts, such as several
windows, iframes or even workers.
Spawning a Shared Worker in a Dedicated Worker
2. Channel Messaging API
The Channel Messaging API allows two separate scripts running in
different browsing contexts attached to the same document (e.g., two
IFrames, or the main document and an IFrame, two documents via a
SharedWorker, or two workers) to communicate directly, passing
messages between one another through two-way channels (or pipes) with
a port at each end.
How to call shared worker from the web worker?
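The Channel Messaging option can be sketched roughly as below: the page creates a MessageChannel and hands one port to each worker, after which the two workers can talk directly. The function name connectWorkers and the {type: 'connect', port} message shape are assumptions made for this example.

```javascript
// Give two workers a direct two-way channel. Each worker receives one end
// (a MessagePort). Ports are transferred, not copied, so each one is also
// listed in the transfer list (second argument of postMessage).
function connectWorkers(workerA, workerB) {
  const { port1, port2 } = new MessageChannel();
  workerA.postMessage({ type: 'connect', port: port1 }, [port1]);
  workerB.postMessage({ type: 'connect', port: port2 }, [port2]);
}

// Inside each worker (sketch):
//   self.onmessage = (e) => {
//     if (e.data.type === 'connect') {
//       const port = e.data.port;
//       port.onmessage = (msg) => { /* handle data from the other worker */ };
//       port.postMessage('hello from this worker');
//     }
//   };
```

Note that this still passes copies of messages between the workers; it shortens the route but does not create shared variables.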

No, but you can send messages to web workers which can be arrays, objects, numbers, strings, booleans, and ImageData or any combination of these. Web workers can send messages back too.

I recently read about (but have not used) shared workers. According to "Share the work! Opera comes with SharedWorker support", they are supported only in the newest browsers (Opera 10.6, Chrome 5, Safari 5).
