How to control the XMLHttpRequest object on an HTML5 Web Worker? - javascript

I have a page which normally overrides window.XMLHttpRequest with a wrapper that does a few extra things, like inserting headers on certain requests.
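A simplified sketch of the kind of wrapper involved (the header name and token here are made up):
var RealXHR = window.XMLHttpRequest;
window.XMLHttpRequest = function () {
    var xhr = new RealXHR();
    var realOpen = xhr.open;
    xhr.open = function () {
        realOpen.apply(xhr, arguments);
        // headers can only be set after open(); 'X-Custom-Auth' is a placeholder
        xhr.setRequestHeader('X-Custom-Auth', 'some-token');
    };
    return xhr;
};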
I have some functionality in a 3rd party library that uses an HTML5 Worker, and we are seeing that requests made inside that Worker do not use the XMLHttpRequest wrapper object. Any request the library makes is therefore missing the required headers, and so it fails.
Is there a way to control the XMLHttpRequest object used by any Worker that the current page creates?
This 3rd party library code looks like this:
function createWorker(url) {
    var worker = new Worker(url);
    worker.onmessage = function (e) {
        if (e.data.status) {
            onprogress(e.data.status);
        } else if (e.data.error) {
            onerror(e.data.error);
        } else {
            exportUtils.saveFile(new Blob([e.data]), params.fileName);
            onfinish();
        }
    };
    worker.postMessage(params); // window.location.origin +
    return worker;
}
The JavaScript returned by the url variable above contains code like this:
return new Promise(function (t, r) {
    var n = new XMLHttpRequest(),
        a = "batch_" + o(),
        u = e.dataUrl.split(e.serviceUrl)[1],
        c = [];
    n.onload = function () {
        for (var e = this.responseText, n = this.responseText.split("\r\n"), o = 0, a = n.length, i = a - 1; o < a && "{" !== n[o].slice(0, 1); )
            o++;
        for (; i > 0 && "}" !== n[i].slice(-1); )
            i--;
        n = n.slice(o, i + 1);
        e = n.join("\r\n");
        try {
            var u = JSON.parse(e);
            t(u);
        } catch (t) {
            r(s + e);
        }
    };
    n.onerror = function () {
        r(i);
    };
    n.onabort = function () {
        r(i);
    };
    n.open("POST", e.serviceUrl + "$batch", !0);
    n.setRequestHeader("Accept", "multipart/mixed");
    n.setRequestHeader("Content-Type", "multipart/mixed;boundary=" + a);
    for (var p in e.headers)
        "accept" != p.toLowerCase() && n.setRequestHeader(p, e.headers[p]);
    c.push("--" + a);
    c.push("Content-Type: application/http");
    c.push("Content-Transfer-Encoding: binary");
    c.push("");
    c.push("GET " + u + " HTTP/1.1");
    for (var p in e.headers)
        c.push(p + ":" + e.headers[p]);
    c.push("");
    c.push("");
    c.push("--" + a + "--");
    c.push("");
    c = c.join("\r\n");
    n.send(c);
});

The answer is both a soft "no" and an eventual "yes".
When a piece of code runs in a different context (like a webworker or an iframe), you do not have direct control of its global object (1).
What's more, XMLHttpRequest isn't the only way to send out network requests - there are several other methods, chief among them the Fetch API.
However, there's a relatively new kid on the block called Service Workers, which can help you quite a bit!
Service workers
Service workers (abbrev. SWs) are very much like the web workers you already know, but instead of only running in the current page, they continue to run in the background as long as your user stays in your domain. They are also global to your entire domain, so any request made from your site will be passed through them.
Their main purpose in life is reacting to network requests; they're typically used for caching and offline content, serving push notifications, and several other niche uses.
Let's see a small example (note, run these from a local webserver):
// index.html
<script>
navigator.serviceWorker.register('sw.js')
    .then(console.log.bind(console, 'SW registered!'))
    .catch(console.error.bind(console, 'Oh nose!'));

setInterval(() => {
    fetch('/hello/');
}, 5000);
</script>

// sw.js
console.log('Hello from a friendly service worker');

addEventListener('fetch', event => {
    console.log('fetch!', event);
});
Here we're registering a service worker and then requesting a page every 5 seconds. In the service worker, we're simply logging each network request, which can be caught in the fetch event.
On first load, you should see the service worker being registered. SWs only begin intercepting requests on page loads after the one on which they were installed...so refresh the page to begin seeing the fetch events being logged. I advise you to play around with the event properties before reading on so things will be clearer.
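If waiting for that second load bothers you, the worker can take control immediately; a minimal sketch using the standard skipWaiting and clients.claim APIs:
// sw.js
// skip the "waiting" phase on install, then claim all open pages on activate
addEventListener('install', () => self.skipWaiting());
addEventListener('activate', event => event.waitUntil(self.clients.claim()));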
Cool! We can see from poking around with the event in the console that event.request is the Request object our browser constructed. In an ideal world, we could access event.request.headers and add our own headers! Dreamy, isn't it!?
Unfortunately, request/response headers are guarded and immutable. Fortunately, we are a stubborn bunch and can simply re-construct the request:
// sw.js
console.log('Hello from a friendly service worker');

addEventListener('fetch', event => {
    console.log('fetch!', event);

    // extract our request
    const { request } = event;

    // clone the current headers
    const newHeaders = new Headers();
    for (const [key, val] of request.headers) {
        newHeaders.append(key, val);
    }

    // ...and add one of our own
    newHeaders.append('Say-What', 'You heard me!');

    // clone the request, but override the headers with our own
    const superDuperReq = new Request(request, {
        headers: newHeaders
    });

    // now instead of the original request, our new request will take precedence!
    event.respondWith(fetch(superDuperReq));
});
This is a few different concepts at play, so it's okay if it takes more than one read to get. Essentially though, we're creating a new request which will be sent in place of the original one, and setting a new header on it! Hurray!
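In practice you probably don't want to rewrite every request flying by. A sketch of scoping the rewrite to one path (the '/api/' prefix is a made-up example):
addEventListener('fetch', event => {
    const url = new URL(event.request.url);
    // only touch calls under /api/ (a hypothetical scope);
    // everything else falls through to the network untouched
    if (url.pathname.startsWith('/api/')) {
        const newHeaders = new Headers(event.request.headers);
        newHeaders.append('Say-What', 'You heard me!');
        event.respondWith(fetch(new Request(event.request, { headers: newHeaders })));
    }
});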
The Bad
Now, to some of the downsides:
Since we're hijacking every single request, we can accidentally change requests we didn't mean to and potentially destroy the entire universe!
Upgrading SWs is a huge pain. The SW lifecycle is complex, and debugging it on your users is difficult. I've seen a good video on dealing with it but unfortunately can't find it right now; MDN has a fairly good description.
Debugging SWs is often a very annoying experience, especially when combined with their weird lifecycle.
Because they are so powerful, SWs can only be served over HTTPS. You should already be using HTTPS anyway, but this is still a hindrance.
This is a lot of work for a relatively small benefit, so maybe reconsider whether you really need it.
(1) You can access the global object of an iframe on the same origin as yours, but getting your code to run first in order to modify the global object is tricky indeed.

Related

Does anyone know how to use navigator.onLine in the main process in Electron?

I know you can use navigator.onLine inside the renderer process because it's rendered inside a browser. But what I'm trying to do is something like this in the main process:
if (navigator.onLine) {
    mainWindow.loadURL("https://google.com");
} else {
    mainWindow.loadFile(path.join(__dirname, 'index.html'));
}
So basically, if the user is offline, just load a local HTML file, and if they're online, take them to a webpage. But, as expected, I keep getting the error that 'navigator is not defined'. Does anyone know how I can somehow use the navigator API in the main process? Thanks!
TL;DR: The easiest thing to do is to just ask Electron. You can do this via the net module from within the Main Process:
const { net } = require ("electron");
const isInternetAvailable = () => net.isOnline ();

// To check:
if (isInternetAvailable ()) { /* do something... */ }
See Electron's documentation on the method; specifically, this approach doesn't tell you whether your own service is accessible via the internet, only that some connection can be made (or not even that, as the documentation mentions link types whose detection would not involve any HTTP request at all).
However, this is not a reliable measurement and you might want to increase its hit rate by manually checking whether a certain connection can be made.
In order to check whether an internet connection is available, you'll have to make a connection yourself and see if it fails. This can be done from the Main Process using plain NodeJS:
// HTTP code basically from the NodeJS HTTP tutorial at
// https://nodejs.dev/learn/making-http-requests-with-nodejs/
const https = require ('https');

const REMOTE_HOST = "google.com"; // Or your domain
const REMOTE_EP = "/"; // Or your endpoint
const REMOTE_PAGE = "https://" + REMOTE_HOST + REMOTE_EP;

function checkInternetAvailability () {
    return new Promise ((resolve, reject) => {
        const options = {
            hostname: REMOTE_HOST,
            port: 443,
            path: REMOTE_EP,
            method: 'GET',
            timeout: 10000, // added: without a timeout value, the 'timeout' event below never fires
        };

        // Try to fetch the given page
        const req = https.request (options, res => {
            // Yup, that worked. Tell the depending code.
            resolve (true);
            req.destroy (); // This is no longer needed.
        });

        req.on ('error', error => {
            reject (error);
        });

        req.on ('timeout', () => {
            // No, connection timed out.
            resolve (false);
            req.destroy ();
        });

        req.end ();
    });
}
// ... Your window initialisation code ...

checkInternetAvailability ().then (
    internetAvailable => {
        if (internetAvailable) mainWindow.loadURL (REMOTE_PAGE);
        else mainWindow.loadFile (path.join (__dirname, 'index.html'));
        // Call any code needed to be executed after this here!
    }
).catch (error => {
    console.error ("Oops, couldn't initialise!", error);
    app.exit (1); // note: app.quit() takes no exit code; app.exit() does
});
Please note that this code might not be the most desirable, since it just "crashes" your app with exit code 1 if there is any error other than a connection timeout.
This, however, makes your startup asynchronous, which means that you need to pay attention to the execution chain of your app startup. Also, startup may be really slow if the timeout is reached; it may be worth consulting NodeJS's http module documentation for ways to tune this.
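For example, one way to keep that chain tidy is to start the check only once Electron signals readiness (a sketch; the window options are illustrative):
const { app, BrowserWindow } = require ('electron');
const path = require ('path');

app.whenReady ().then (() => {
    const mainWindow = new BrowserWindow ({ width: 800, height: 600 });
    // run the connectivity check, then load the appropriate content
    return checkInternetAvailability ().then (internetAvailable => {
        if (internetAvailable) mainWindow.loadURL (REMOTE_PAGE);
        else mainWindow.loadFile (path.join (__dirname, 'index.html'));
    });
});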
Also, it makes sense to actually try to retrieve the page you want to load in the BrowserWindow (constants REMOTE_HOST and REMOTE_EP), because that also gives you an indication of whether your server is up or not, although it means the page will be fetched twice (in the best case: once when the connection test succeeds and once when Electron loads the page into the window). However, that should not be too big a problem, since no external assets (images, CSS, JS) will be loaded by the test request.
One last note: this is not a good metric of whether any internet connection is available; it just tells you whether your server answered within the timeout window. It might very well be that every other service works, or that the connection is just very slow (i.e., expect false negatives). It should be "good enough" for your use case, though.

Service worker: caching cross-domain response with all static assets as well

I am currently working on a proof-of-concept for a project, but I just can't get my head around the documentation and implementation :)
The case is as follows:
I have my main app (React) that has a list of links. All of them link to a specific page.
These links open up in an iframe.
That's all basically.
So my app runs on "app.domain.com" and the forms urls are like "pages.domain.com/pages/pageA.html" etc.
What I need to do in this poc is to make these pages available offline, including(!) the assets for these pages (css/js/img).
I already created a simple service worker.
const CACHE_NAME = "poc-forms";

self.addEventListener("install", (event) => {
    console.log("sw installing…");
});

self.addEventListener("activate", (event) => {
    console.log("sw now ready to handle fetches");
    event.waitUntil(caches.open(CACHE_NAME).then(() => self.clients.claim()));
});

self.addEventListener("fetch", (event) => {
    const url = new URL(event.request.url);
    if (url.pathname.includes("forms")) {
        event.respondWith(
            (async function () {
                var cache = await caches.open(CACHE_NAME);
                var cachedFiles = await cache.match(event.request);
                if (cachedFiles) {
                    return cachedFiles;
                } else {
                    try {
                        var response = await fetch(event.request);
                        await cache.put(event.request, response.clone());
                        return response;
                    } catch (e) {
                        console.log("something went wrong!");
                    }
                }
            })()
        );
    }
});
It fetches the request and checks if it's already in the cache or not.
If it's not, cache it.
This works.
But where I'm stuck:
how can I also store the css and js that are needed for the pages? Do I need to find a way to get all the links, loop over them, fetch them and store them?
I heard about Google Workbox and went through its documentation, but it's just not clear to me how to transform my current SW into something that works with Workbox, with the whole registerRoute-thing on a fetch...
the service worker will only capture the fetches when the page is refreshed. The clients.claim() should fix this, but it doesn't actually...
If someone out there could help me out with this, much appreciated!
thanks,
Mario
The Service Worker APIs do not allow for a service worker registered on app.domain.com to control either navigation requests or subresource requests associated with an <iframe> that is hosted on pages.domain.com. That's not something that Workbox can help with—it's just not possible.
From the perspective of the service worker, an <iframe> with a different origin than the host page is equivalent to a new window or tab with a different origin. The service worker can't "see" or take control of it.
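The practical consequence is that pages.domain.com needs to register its own service worker, from a page served on that origin, if the iframe content should work offline. Roughly (a sketch; the scope path is an assumption):
// served from pages.domain.com, e.g. inside one of the pages
navigator.serviceWorker.register('/sw.js', { scope: '/pages/' });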
Also, this might help others; it's a small variant of what I want to accomplish:
Why isn't my offline cached subdomain page not loading when called from other subdomain?
Ok, we ended up using a different approach, because the app we are loading inside the iframe apparently exposes an API we can hook into.
To cache the entire page, with all the assets, this was the code that worked for me:
importScripts(
    "https://storage.googleapis.com/workbox-cdn/releases/5.1.2/workbox-sw.js"
);

const { skipWaiting, clientsClaim } = workbox.core;
const { registerRoute } = workbox.routing;
const { StaleWhileRevalidate } = workbox.strategies;

skipWaiting();
clientsClaim();

registerRoute(
    ({ request }) => {
        console.log(" request ", request.destination);
        return (
            request.destination === "iframe" ||
            request.destination === "document" ||
            request.destination === "image" ||
            request.destination === "script" ||
            request.destination === "style" ||
            request.destination === "font"
        );
    },
    new StaleWhileRevalidate()
);

Inspecting multiple WebSocket connections at the same time

I am using the solutions provided in the following topics to inspect WebSockets traffic (messages) on a web page which I do not own (solely for learning purposes):
Inspecting WebSocket frames in an undetectable way
Listening to a WebSocket connection through prototypes
https://gist.github.com/maskit/2252422
Like this:
(function () {
    var ws = window.WebSocket;
    window.WebSocket = function (a, b, c) {
        var that = c ? new ws(a, b, c) : b ? new ws(a, b) : new ws(a);
        that.addEventListener('open', console.info.bind(console, 'socket open'));
        that.addEventListener('close', console.info.bind(console, 'socket close'));
        that.addEventListener('message', console.info.bind(console, 'socket msg'));
        return that;
    };
    window.WebSocket.prototype = ws.prototype;
}());
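Outgoing frames can be captured in the same spirit by wrapping send on the shared prototype, e.g. inside the same IIFE (a sketch):
var realSend = ws.prototype.send;
ws.prototype.send = function (data) {
    // log the payload before handing it to the native implementation
    console.info('socket send', data);
    return realSend.call(this, data);
};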
The issue with the provided solutions is that they are listening on only 1 of 3 WebSocket connections ("wss://..."). I am able to see in the console the messages that I receive or send, but only for one connection. Is there something I am missing? Is it possible that the two other services are somehow different, prohibiting the use of the prototype-extension technique?
P.S. I will not provide a URL to the web resource I am doing my tests on, in order to avoid possible bans or legal questions.
Okay, since it's been weeks with no answers, I will post the solution I ended up using.
I have built my own Chrome extension that listens to WebSocket connections and forwards all requests and responses to my own WebSocket server (which I happen to run in C#).
There are some limitations to this approach. You are not seeing the request headers or who is sending the packets; you are only able to see the payload, and that is it. Also, you are not able to modify the contents in any way or send your own requests (remember: you have no access to header metadata). Naturally, another limitation is that you have to be running Chrome (DevTools APIs are used).
Some instructions.
Here is how you attach the debugger to listen to network packets:
chrome.debugger.attach({ tabId: tabId }, "1.2", function () {
    chrome.debugger.sendCommand({ tabId: tabId }, "Network.enable");
    chrome.debugger.onEvent.addListener(onTabDebuggerEvent);
});
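For completeness: the chrome.debugger API is only exposed to the extension when the manifest requests it. A minimal manifest sketch (name and version are illustrative; "tabs" is needed because the code below reads tab URLs):
// manifest.json
{
    "name": "WebSocket sniffer",
    "version": "1.0",
    "manifest_version": 2,
    "permissions": ["debugger", "tabs"]
}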
Here is how you catch them:
function onTabDebuggerEvent(debuggeeId, message, params) {
    var debugeeTabId = debuggeeId.tabId;
    chrome.tabs.get(debugeeTabId, function (targetTab) {
        var tabUrl = targetTab.url;
        if (message == "Network.webSocketFrameSent") {
        }
        else if (message == "Network.webSocketFrameReceived") {
            var payloadData = params.response.payloadData;
            var request = {
                source: tabUrl,
                payload: params.response.payloadData
            };
            websocket.send(JSON.stringify(request));
        }
    });
}
Here is how you create a websocket client:
var websocket = new WebSocket("ws://127.0.0.1:13529");

setTimeout(() => {
    if (websocket.readyState !== 1) {
        console.log("Unable to connect to a WebsocketServer.");
        websocket = null;
    }
    else {
        console.log("WebsocketConnection started", websocket);
        websocket.onclose = function (evt) {
            console.log("WebSocket connection got closed!");
            if (evt.code == 3001) {
                console.log('ws closed');
            } else {
                console.log('ws connection error');
            }
            websocket = null;
        };
        websocket.onerror = function (evt) {
            console.log('ws normal error: ' + evt.type);
            websocket = null;
        };
    }
}, 3000);
Creating the server is outside the scope of this question. You can use one in Node.js, C# or Java, whatever is preferable for you.
This is certainly not the most convenient approach, but unlike the JavaScript injection method, it works in all cases.
Edit: I totally forgot to mention. There seems to be another way of solving this, BUT I have not dug into that topic, so maybe this is false info in some way. It should be possible to catch packets at the network interface level, through packet sniffing utilities such as Wireshark or pcap. Maybe something I will investigate further in the future :)

MessageChannel port.postMessage's data is null when calling postMessage with a transferable object?

I'm learning about MessageChannel and transferable objects.
I've got an iframe which is cross-domain from my page. The documentation surrounding MessageChannel indicates that it fully supports cross-domain communications.
I've got this code inside of my cross-domain page inside of an iframe:
var messageChannel = new MessageChannel();

// Transfer port2 to the background page to establish communications.
window.parent.postMessage('connect', 'chrome-extension://jbnkffmindojffecdhbbmekbmkkfpmjd', [messageChannel.port2]);
messageChannel.port1.start();

// Give time for background to setup its port. Not great practice, but OK for example.
setTimeout(function () {
    // Create a 32MB "file" and fill it.
    var uInt8Array = new Uint8Array(1024 * 1024 * 32); // 32MB
    for (var i = 0; i < uInt8Array.length; ++i) {
        uInt8Array[i] = i;
    }

    messageChannel.port1.onmessage = function (message) {
        console.log('iframe message:', message);
    };

    messageChannel.port1.postMessage(uInt8Array.buffer, [uInt8Array.buffer]);
    if (uInt8Array.buffer.byteLength)
        throw "Failed to transfer buffer";
}, 1000);
and in my background page I have:
window.onmessage = function (messageEvent) {
    // Make sure the origin is correct for security
    if (messageEvent.origin === 'https://www.youtube.com') {
        if (messageEvent.ports.length > 0 && messageEvent.data === 'connect') {
            var port = messageEvent.ports[0];
            port.onmessage = function (message) {
                console.log("background message:", message);
            };
        }
    }
};
When I attempt to postMessage the uInt8Array buffer, I receive no data on the other side: the received message's event.data is null.
but if I try and send something simple, say:
messageChannel.port1.postMessage('hello');
then the message comes through with its data intact.
When using transferable objects -- is the data represented somewhere else? I seem to be able to transfer the port just fine, but I'm struggling to transfer the array of data. BUT, since my exception isn't being thrown -- it looks like it IS transferred... but where did it go??
I've reduced your code sample and discovered that the ArrayBuffer is always lost when it is passed through a MessagePort of a MessageChannel.
Reported as issue 334408: "ArrayBuffer is lost in MessageChannel during postMessage (receiver's event.data == null)"
https://code.google.com/p/chromium/issues/detail?id=334408
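Until that bug is fixed, a possible workaround (assuming you can afford the copy) is to omit the transfer list, so the buffer is structured-cloned instead of transferred:
// copies the 32MB buffer rather than moving it,
// but sidesteps the bug described above
messageChannel.port1.postMessage(uInt8Array.buffer);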

The request is too large for IE to process properly

I am using WebSync 3's JavaScript API and subscribing to approximately 9 different channels on one page. Firefox and Chrome have no problems, but IE9 is throwing an alert error stating "The request is too large for IE to process properly."
Unfortunately the internet has little to no information on this. So does anyone have any clues as to how to remedy this?
var client = fm.websync.client;

client.initialize({
    key: '********-****-****-****-************'
});

client.connect({
    autoDisconnect: true,
    onStreamFailure: function (args) {
        alert("Stream failure");
    },
    stayConnected: true
});

client.subscribe({
    channel: '/channel',
    onSuccess: function (args) {
        alert("Successfully connected to stream");
    },
    onFailure: function (args) {
        alert("Failed to connect to stream");
    },
    onSubscribersChange: function (args) {
        var change = args.change;
        for (var i = 0; i < change.clients.length; i++) {
            var changeClient = change.clients[i];
            // If someone subscribes to the channel
            if (change.type == 'subscribe') {
            // If something unsubscribes to the channel
            } else {
            }
        }
    },
    onReceive: function (args) {
        text = args.data.text;
        text = text.split("=");
        text = text[1];
        if (text != "status" && text != "dummytext") {
            //receiveUpdates(id, serial_number, args.data.text);
            var update = eval('(' + args.data.text + ')');
        }
    }
});
This error occurs when WebSync is using the JSON-P protocol for transfers. This is mostly just for IE in cross-domain environments, meaning WebSync is on a different domain than the one your webpage is being served from, so IE won't make regular XHR requests for security reasons.
JSON-P basically encodes the up-stream data (your 9 channel subscriptions) as a URL encoded string that is tacked onto a regular request to the server. The server is supposed to interpret that URL-encoded string and send back the response as a JavaScript block that gets executed by the page.
This works fine, except that IE also has a limit of roughly 2kb on the overall URL length of an HTTP request. So if you pack too much into a single request to WebSync, you might exceed this 2kb upstream limit.
The easiest solution is to either split up your WebSync requests into smaller pieces (i.e., subscribe to only a few channels at a time in JavaScript), or to subscribe to one "master channel" and then program a WebSync BeforeSubscribe event that watches for that channel and rewrites the subscription channel list.
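The first option is straightforward on the client side; a sketch of issuing several small subscribe calls instead of one big one, reusing the client from the question (channel names are placeholders):
var channels = ['/chan1', '/chan2', '/chan3' /* ...and so on, up to 9 */];
channels.forEach(function (channel) {
    // each subscribe call becomes its own (small) JSON-P request
    client.subscribe({
        channel: channel,
        onReceive: function (args) { /* per-channel handling */ }
    });
});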
I suspect, because you have a key in your example source above, that you are using WebSync On-Demand? If that's the case, the only way to make a BeforeSubscribe event handler is to create a WebSync proxy.
So for the moment, since everyone else is stumped by this question as well, I put a trap in my PHP to not even load this JavaScript if the browser is Internet Destroyer (uhh, I mean Internet Explorer). Maybe a solution will come in the future though.
