How can I access chrome.history from a Web Worker?

How can I access chrome.history from a Web Worker? - javascript

Is there any way I can access the chrome.* apis (specifically chrome.history) from a Web Worker?
If I pass the chrome.history or chrome object in with postMessage, it is not working because of a conversion error to Transferable type.
I can successfully query the history from my extension and pass the results, but I would like to leave the heavy lifting to the worker instead of the main thread and then pass the results.

Web Workers are meant to be light-weight, and do not inherit any permissions (not even host permissions) from the extension (besides, chrome is not even defined in a Web worker).
If you're doing really heavy stuff with the results of the chrome.history API, then you could pass the result of a callback to a worker for processing (with Transferables, the overhead is minimal). Before doing that, make sure that you profile whether the performance impact is really that significant to warrant implementing anything like this.

Related

How are ServiceWorkers and ServiceWorkerContainer.prototype.register, used for fingerprinting and bot detection

I often see websites that have advanced bot detection/fingerprinting technology that make use of ServiceWorkers and/or ServiceWorkerContainer.prototype.register. I'm not sure what exactly they do with it. There is also a chrome extension, Dont FingerPrint Me, that claims ServiceWorkers are used for fingerprinting and provides a feature to detect when a website does so. However, it doesn't explain how they are used. I've tried understanding it by reading the code, but did not get anywhere.
So, my question is, how can that be used for fingerprinting or detecting bots? By bots I mean browsers automated via selenium, remote debugger, or some other automation tool.
Edit: Unfortunately, atm I don't have any links saved of some sites I've previously come across that are using this technology. If I find one again I will update the post.
Edit: I was told that they can be used to bypass fingerprint blockers a browser might be running and to detect spoofed properties. I'm not sure how valid this information is however.

This topic is new to me so this is likely not exhaustive:
The W3C provides a Privacy Section on Service Workers that contains the following:
Service workers introduce new persistent storage features including scope to registration map (for service worker registrations and their service workers), request response list and name to cache map (for caches), and script resource map (for script resources). In order to protect users from any potential unsanctioned tracking threat, these persistent storages should be cleared when users intend to clear them and should maintain and interoperate with existing user controls e.g. purging all existing persistent storages. src
Looking at some example of code from Google, it seems that a service worker can declare any number of files to cache during its install event
self.addEventListener('install', (event) => {
event.waitUntil(
caches.open('v1').then((cache) => {
return cache.addAll([
'./sw-test/',
'./sw-test/index.html',
'./sw-test/style.css',
'./sw-test/app.js',
'./sw-test/image-list.js',
'./sw-test/star-wars-logo.jpg',
'./sw-test/gallery/',
'./sw-test/gallery/bountyHunters.jpg',
'./sw-test/gallery/myLittleVader.jpg',
'./sw-test/gallery/snowTroopers.jpg'
]);
})
);
});
src
It's seems conceivable that any of files cached by the service could contain a user identifying fingerprint for further tracking.
A related threat, not directly related with Service Workers but closely coupled, are exploits within the manifest.json file.
It's conceivable that the start_url could be crafted to indicate that the application was launched from outside the browser (e.g., "start_url": "index.html?launcher=homescreen"). This can be useful for analytics and possibly other customizations. However, it is also conceivable that developers could encode strings into the start_url that uniquely identify the user (e.g., a server assigned UUID). This is fingerprinting/privacy sensitive information that the user might not be aware of.
Given the above, it is RECOMMENDED that, upon installation, or any time thereafter, a user agent allows the user to inspect and, if necessary, modify the start URL of an application. src
Further Reading
I found at least 1 SO post attempting to incorporate fingerprintjs2 into a Service Worker:
Fingerprinting with service workers
I found this blog post on how to use Google Analytics to collect both online and offline user metrics in PWAs using Service Workers.
https://builtvisible.com/google-analytics-for-pwas-tracking-offline-behaviour-and-more/

What are the restrictions on what can and cannot be done in a a service worker?

Further to Can I http poll or use socket.io from a Service Worker on Safari iOS? what is the list of what can and cannot be done in a service worker? The answer referenced above says "You cannot ... have an open connection of any sort to your server" which makes sense, but where is that fact documented and how is the restriction enforced?
For example, are certain browser APIs unavailable to Service Workers? or is there an execution quota which prevents a long running process?
Eg. if my service worker has ...
setInterval (()=>{console.log('foo'), 1000})
... will it throw an exception?, will it run and then fail? is the behaviour browser dependent?

Service Workers are only supposed to process attached events.
And those are to be registered by some script from the outside.
Even delaying the execution is not supported in some cases on Safari - Event.waitUntil(promise).
Once your event queue is empty, your user agent is supposed to decide wether it kills of the service. There is no guarantee that anything from then on is going to get executed.

A service worker is not just another thread but a very specific kind of thread. As in it is meant to intercept network and resource fetch requests and do something with it. In its most basic form, it caches if the network is not available, but it can also return a different resource than what was requested, an older version or a placeholder etc.
Eg. if my service worker has ... setInterval (()=>{console.log('foo'),
1000}) ... will it throw an exception? will it run and then fail? is
the behaviour browser dependent?
It will likely work. However, there is very little point of doing this since you neither have any DOM access nor can you directly interact with the user. At most, you can print out errors and warnings though I don't know what warning would require interval polling.
From the question, it sounds you are trying to accomplish some background work without blocking the main thread. In which case, the more generic type (the Worker API) is your friend.

How do I broadcast messages within a network with Javascript?

The question is fairly self explanatory. I want to auto-detect my server software within a local network from a webpage. I'm able to send and receive broadcasts with node, but for this to work I need to be able to send or receive broadcasts with in-browser javascript, and then connect directly to my server.
Does anyone know how to do this? Is there a library for it, or am I out of luck?

I would heartily recommend that you take a look at coreos/etcd, hashicorp/consul or some other service discovery solution which exposes an HTTP interface and JSON data about the location of your services.
Since you cannot access the underlying networking devices from the browser (imagine if I could start probing SO's internal network from my external location), arguably, it takes as much time to set up as it would for you to write a proper Node.js application to discover resources on your network and expose these via JSON to your clients, but using proper service discovery solutions means you can take this to any kind of networking configuration your applications may be running in tomorrow under any kind of circumstances they might find themselves in whilst running (fiber optic cables got cut out between two centers, something hard fell down and broke the switch, something monopolized all the network bandwidth, the IP address of the service changes intermittently, etc.).

Service Worker vs Shared Worker

What is the difference between Service Worker and Shared Worker?
When should I use Service Worker instead of Shared Worker and vice versa?

A service worker has additional functionality beyond what's available in shared workers, and once registered, they persist outside the lifespan of a given web page.
Service workers can respond to message events, like shared workers, but they also have access to additional events. Handling fetch events allows service workers to intercept any network traffic (originating from a controlled page) and take specific actions, including serving responses from a Request/Response cache. There are also plans to expose a push event to service workers, allowing web apps to receive push messages in the "background".
The other major difference relates to persistence. Once a service worker is registered for a specific origin and scope, it stays registered indefinitely. (A service worker will automatically be updated if the underlying script changes, and it can be either manually or programmatically removed, but that's the exception.) Because a service worker is persistent, and has a life independent of the pages active in a web browser, it opens the door for things like using them to power the aforementioned push messaging—a service worker can be "woken up" and process a push event as long as the browser is running, regardless of which pages are active. Future web platform features are likely to take advantage of this persistence as well.
There are other, technical differences, but from a higher-level view, those are what stand out.

A SharedWorker context is a stateful session and is designed to multiplex web pages into a single app via asynchronous messaging (client/server paradigm). Its life cycle is domain based, rather than single page based like DedicatedWorker (two-tier paradigm).
A ServiceWorker context is designed to be stateless. It actually is not a persistent session at all - it is the inversion of control (IoC) or event-based persistence service paradigm. It serves events, not sessions.
One purpose is to serve concurrent secure asynchronous events for long running queries (LRQs) to databases and other persistence services (ie clouds). Exactly what a thread pool does in other languages.
For example if your web app executes many concurrent secure LRQs to various cloud services to populate itself, ServiceWorkers are what you want. You can execute dozens of secure LRQs instantly, without blocking the user experience. SharedWorkers and DedicatedWorkers are not easily capable of handling many concurrent secure LRQs. Also, some browsers do not support SharedWorkers.
Perhaps they should have called ServiceWorkers instead: CloudWorkers for clarity, but not all services are clouds.
Hopefully this explanation should lead you to thinking about how the various Worker types were designed to work together. Each has its own specialization, but the common goal is to reduce DOM latency and improve user experience in web based applications.
Throw in some WebSockets for streaming and WebGL for graphics and you can build some smoking hot web apps that perform like multiplayer console games.

2022 06 Update
WebKit added support for the SharedWorker recently, see the details of the resolution in the issue link mentioned below.
2020 11 Update
Important detail for anyone interested in this discussion: SharedWorker is NOT supported by WebKit (was intentionally removed ~v6 or something).
WebKit team are explicitly suggesting to use ServiceWorker wherever SharedWorker might seem relevant.
For a community wish to get this functionality back to WebKit see this (unresolved as of now) issue.

Adding up to the previous great answers.
As the main difference is that ServiceWorker is stateless (will shut down and then start with clear global scope) and SharedWorker will maintain state for the duration of the session.
Still there is a possibility to request that ServiceWorker will maintain state for the duration of a message handler.
s.onmessage = e => e.waitUntil((async () => {
// do things here
// for example issue a fetch and store result in IndexedDb
// ServiceWorker will live till that promise resolves
})())
The above code requires that the ServiceWorker will not shut down till the promise given as the parameter to waitUntil resolves. If many messages are handled concurrently in that manner ServiceWorker will not shut down untill all promises are resolved.
This could be possibly used to prolong ServiceWorker life indefinitely making it effectively a SharedWorker. Still, do keep in mind that browser might decide to force a shut down if ServiceWorker goes on fo too long.

Securing AJAX API

I have an API (1) on which I have build an web application with its own AJAX API (2). The reason for this is not to expose the source API.
However, the web application uses AJAX (through JQuery) go get new data from its AJAX API, the data retrieved is currently in XML.
Lately I have secured the main API (1) with an authorization algorithm. However, I would like to secure the web application as well so it cannot be parsed. Currently it is being parsed to get the hash used to call the AJAX API, which returns XML.
My question: How can I improve the security and decrease the possibility of others able to parse my web application.
The only ideas I have are: stop sending XML, but send HTML instead. Use flash (yet, this is not an option).
I understand that since the site is public, and no login can be implemented, it can be hard to refuse access to bots (non legitimate users). Also, Flash is not an option... it never is ;)
edit
The Web Application I am referring to: https://bikemap.appified.net/

This is somewhat of an odd request; you wish to lock down a system that your own web application depends on to work. This is almost always a recipe for disaster.
Web applications should always expect to be sidelined, so the real security must come from the server; tarpitting, session tokens, throttling, etc.
If that's already in place, I don't see any reason why should jump through hoops in your own web application to give robots a tougher time ... unless you really want to separate humans from robots ;-)
One way to reduce the refactoring pain on your side is to wrap the $.ajax function in a piece of code that could sign the outgoing requests (or somehow add fields to it) ... then minify / obscurify that code and hope it won't get decoded so fast.

We Keep Coding

JavaScript is the programming language of the Web.