Is it possible to use a Chrome Extension to change document.referrer?

I want my extension to change the document.referrer of certain webpages, but searching for this leads to documentation on how to change the Referrer in the HTTP request headers.
Is there any way to use an extension to change document.referrer, preferably before the javascript(s) of a webpage loads?
Currently I refresh pages where I want the document.referrer altered with window.location.replace(window.location.href), but this generates a noticeable flicker and can only change the referrer to the current page.

override.js:
Object.defineProperty(Document.prototype, 'referrer', {
  get() {
    return 'foo';
  },
});
This should run in the page context at document_start; there are several ways to do that.
For ManifestV3 the most reliable method is registerContentScripts. Specify the correct pattern in matches and optionally add allFrames: true and matchOriginAsFallback: true if the site uses frames.
Note that due to a bug in Chrome the sites can extract the original getter and call it on the main document to get the real referrer. The workarounds are mentioned in the report and aren't simple.
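The registration described above could look roughly like this. It is a minimal sketch, not the example the answer linked to: the script id and the match pattern are placeholders, and it assumes the extension has the "scripting" permission plus host permissions for the matched sites. The world: "MAIN" setting is what puts override.js into the page context.

```javascript
// background.js (MV3 service worker) - a sketch under the assumptions above.
// "override.js" is the file from the answer; id and matches are placeholders.
const referrerOverride = {
  id: "referrer-override",
  js: ["override.js"],
  matches: ["https://example.com/*"],
  runAt: "document_start", // before the page's own scripts run
  world: "MAIN",           // page context, so the page sees the new getter
  allFrames: true,
  matchOriginAsFallback: true,
};

async function registerOverride(scripting) {
  // Registered scripts persist across sessions, so drop a stale copy first.
  const existing = await scripting.getRegisteredContentScripts({
    ids: [referrerOverride.id],
  });
  if (existing.length > 0) {
    await scripting.unregisterContentScripts({ ids: [referrerOverride.id] });
  }
  await scripting.registerContentScripts([referrerOverride]);
}

// Only runs inside an extension, where the chrome.scripting API exists.
if (typeof chrome !== "undefined" && chrome.scripting) {
  chrome.runtime.onInstalled.addListener(() => registerOverride(chrome.scripting));
}
```

Passing chrome.scripting in as a parameter is only there to keep the function easy to try outside the browser.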

Related

Is there an alternative to preprocessorScript for Chrome DevTools extensions?

I want to create a custom profiler for Javascript as a Chrome DevTools Extension. To do so, I'd have to instrument all Javascript code of a website (parse to AST, inject hooks, generate new source). This should've been easily possible using chrome.devtools.inspectedWindow.reload() and its parameter preprocessorScript described here: https://developer.chrome.com/extensions/devtools_inspectedWindow.
Unfortunately, this feature has been removed (https://bugs.chromium.org/p/chromium/issues/detail?id=438626) because nobody was using it.
Do you know of any other way I could achieve the same thing with a Chrome Extension? Is there any other way I can replace an incoming Javascript source with a changed version? This question is very specific to Chrome Extensions (and maybe extensions to other browsers), I'm asking this as a last resort before going a different route (e.g. dedicated app).

Use the Chrome Debugging Protocol.
First, use DOMDebugger.setInstrumentationBreakpoint with eventName: "scriptFirstStatement" as a parameter to add a breakpoint on the first statement of each script.
Second, the Debugger domain has an event called scriptParsed. Listen for it and, when it fires, use Debugger.setScriptSource to change the source.
Finally, call Debugger.resume each time after you edited a source file with setScriptSource.
Example in semi-pseudo-code:
// Prevent code being executed
cdp.sendCommand("DOMDebugger.setInstrumentationBreakpoint", {
  eventName: "scriptFirstStatement"
});
// Enable Debugger domain to receive its events
cdp.sendCommand("Debugger.enable");
cdp.addListener("message", (event, method, params) => {
  // Script is ready to be edited
  if (method === "Debugger.scriptParsed") {
    cdp.sendCommand("Debugger.setScriptSource", {
      scriptId: params.scriptId,
      scriptSource: `console.log("edited script ${params.url}");`
    }, (err, msg) => {
      // After editing, resume code execution.
      cdp.sendCommand("Debugger.resume");
    });
  }
});
The implementation above is not ideal. It should probably listen for the breakpoint event, find the script from the associated event data, edit it, and then resume. Listening for scriptParsed and resuming the debugger don't really belong together, and combining them could cause problems. It makes for a simpler example, though.
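A rough sketch of that breakpoint-driven variant follows. The send(method, params) function is a hypothetical Promise-returning transport (for instance a thin wrapper around chrome.debugger.sendCommand), and rewrite(source) stands for whatever instrumentation you apply. It assumes the script can be identified from the top call frame of the Debugger.paused event; whether Chrome accepts a live edit at that point may depend on the version, so treat this as a starting point only.

```javascript
// Handler for the Debugger.paused event; `send` and `rewrite` are injected.
function createPausedHandler(send, rewrite) {
  return async function onPaused(params) {
    // The top frame belongs to the script whose first statement is about to run.
    const { scriptId } = params.callFrames[0].location;
    try {
      const { scriptSource } = await send("Debugger.getScriptSource", { scriptId });
      await send("Debugger.setScriptSource", {
        scriptId,
        scriptSource: rewrite(scriptSource),
      });
    } finally {
      // Always resume, even if the edit was rejected, so the page does not hang.
      await send("Debugger.resume", {});
    }
  };
}
```

You would call the returned function from your event listener whenever method === "Debugger.paused".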

On HTTP you can use the chrome.webRequest API to redirect requests for JS code to data URLs containing the processed JavaScript code.
However, this won't work for inline script tags. It also won't work on HTTPS, since the data URLs are considered unsafe. And data URLs can't be longer than 2MB in Chrome, so you won't be able to redirect to large JS files.
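To make the redirect idea concrete, here is a small sketch. The helper name and the processedSources cache are made up for illustration, and the listener uses the blocking form of webRequest, which as far as I know is only available to ManifestV2 extensions (with the "webRequest" and "webRequestBlocking" permissions).

```javascript
// Turns processed JavaScript into a data: URL, or returns null when the
// result would exceed the 2MB limit mentioned above.
const MAX_URL_LENGTH = 2 * 1024 * 1024;

function toScriptDataUrl(source) {
  const url = "data:text/javascript;charset=utf-8," + encodeURIComponent(source);
  return url.length <= MAX_URL_LENGTH ? url : null;
}

// Hypothetical cache: original script URL -> already processed source.
const processedSources = new Map();

if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    (details) => {
      const source = processedSources.get(details.url);
      const redirectUrl = source === undefined ? null : toScriptDataUrl(source);
      // Returning an empty object lets the original request go through.
      return redirectUrl ? { redirectUrl } : {};
    },
    { urls: ["http://*/*"], types: ["script"] },
    ["blocking"]
  );
}
```

How the cache gets filled (for example by fetching and instrumenting scripts ahead of time) is left open here.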
If the exact order of execution of each script isn't important you could cancel the script requests and then later send a message with the script content to the page. This would make it work on HTTPS.
To address both issues you could redirect the HTML page itself to a data URL, in order to gain more control. That has a few negative consequences though:
1. Can't reload page because URL is fixed to data URL
2. Need to add or update <base> tag to make sure stylesheet/image URLs go to the correct URL
3. Breaks ajax requests that require cookies/authentication (not sure if this can be fixed)
4. No support for localStorage on data URLs
Not sure if this works: in order to fix #1 and #4 you could consider setting up an HTML page within your Chrome extension and then using that as the base page instead of a data URL.
Another idea that may or may not work: Use chrome.debugger to modify the source code.

Load external page and Replace text

Would it be possible to load an external page inside a container and replace text elements?
We work with ad campaigns and earn a percentage whenever a user signs up.
Can a script replace certain words? For instance “User” to “Usuario” or “Password” to “Contraseña” without affecting the original website or its functions.
Note: These links always pass through a redirection.
Example:
http://a2g-secure.com/?E=/0yTeQmWHoKOlN6zUciCXQwUzfnVGPGN&s1=
Note 2: Using an iframe is out of the question due to “Same-origin policy”.

I'm not sure if this answers your question, but you might find it useful.
(Perhaps you might give a step-by-step example of what you're trying to accomplish?)
If we assume that a browser attempts to retrieve page P from a proxy which first retrieves the content of page P from its actual home and then performs some transformation on its content before returning that page content to the browser, what you're describing is a Reverse HTTP Proxy and is a very well-known page serving technique.
Rather than performing complex transformations at the server (which require specialized knowledge of the page layout), this technique is usually used to inject a single line into the retrieved source that calls a JavaScript file to actually perform the required transformation at the browser.
So in essence:
1. Browser requests Page P from Proxy 1.
2. Proxy 1 retrieves the actual Page P from its real home, Server 2.
3. Proxy 1 adds the line <script src="//proxy1.com/transform.js"></script> to the source of Page P.
4. Proxy 1 then returns the modified source of Page P to Browser.
5. Once the Browser has received the page content, the JavaScript file is also retrieved, which can then modify the page contents in any way required.
This technique can also solve your "Same-origin policy" issue: load the iframe from a URL on the same server that served the iframe's parent page, and have that server act as the proxy, like:
http://example.com/?proxy_target=//server2.com/pageP.html
Thus, the browser only "sees" content from a single server.
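The rewriting step that Proxy 1 performs might be sketched like this. The script URL is the one from the steps above; the function names and the injected fetchPage parameter are only illustrative, and a real proxy would also have to deal with headers, encodings and redirects.

```javascript
// The one line the proxy adds to the retrieved page.
const TRANSFORM_TAG = '<script src="//proxy1.com/transform.js"></script>';

function injectTransformScript(html) {
  // Insert just before the first </body> so the page content exists when the
  // script runs; fall back to appending if there is no closing body tag.
  const closingBody = /<\/body\s*>/i;
  if (!closingBody.test(html)) return html + TRANSFORM_TAG;
  return html.replace(closingBody, (tag) => TRANSFORM_TAG + tag);
}

// fetchPage(url) must return a Promise of the page's HTML. It is passed in so
// the sketch does not depend on a particular HTTP client.
async function proxyPage(targetUrl, fetchPage) {
  const html = await fetchPage(targetUrl);
  return injectTransformScript(html);
}
```

On Node 18 or later, something like (url) => fetch(url).then((r) => r.text()) could serve as fetchPage.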

You would need to load the external page server-side, and then you can do whatever you want with it. You can do serverside string replacement, or you can do it later in javascript.
But, remember that as soon as you add a whole webpage into for example a div in your own page, the css from your page will affect it.
In addition, you would need to rewrite all the links in the document so that they are absolute URLs. If the page depends on ajax, there is pretty much no way to accomplish what you want to do.
If on the other hand the pages you will be loading are static html, it is possible, though there are a lot of things you need to take care of before you can actually present the page to the user, like adjusting links, urls to stylesheets and so on.
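As an illustration of the link adjustment, here is a deliberately naive sketch that rewrites href and src attributes with a regular expression. The function name and the attribute list are my own assumptions; a real implementation would use an HTML parser and also handle things like srcset and CSS url() references.

```javascript
// Rewrites quoted href/src attribute values to absolute URLs based on the
// address the page was originally fetched from.
function absolutizeUrls(html, baseUrl) {
  return html.replace(/\b(href|src)=(["'])(.*?)\2/gi, (match, attr, quote, value) => {
    try {
      return `${attr}=${quote}${new URL(value, baseUrl).href}${quote}`;
    } catch {
      return match; // leave values that cannot be parsed untouched
    }
  });
}

const page = '<a href="/login">Login</a><img src="img/logo.png">';
console.log(absolutizeUrls(page, "http://server2.com/dir/pageP.html"));
// <a href="http://server2.com/login">Login</a><img src="http://server2.com/dir/img/logo.png">
```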

It seems you are trying to localize a website on the fly, using your server as a proxy for that content. Does it make sense? If that's the case, depending on the size of your operation, there are several proxy translation services out there (I'll name them if needed).
Basically, they scrape a website, providing a way for you to translate and host the translated content. Of course, this depends on your relationship with the content providers. You should also take this into consideration, since modifying content, even for translation, can be a copyright problem.
All things considered, if you trust the provider's javascript, the solution involves scraping the content, as mentioned in other answers, and serving that modified content. You really need to trust the origin...
update per request
http://www.easyling.com
http://www.smartling.com
http://www.motionpoint.com
http://www.lionbridge.com/solutions/translation-proxy/
http://www.sajan.com/translation-proxy-technology-and-traditional-website-translation-understanding-your-options/
They are all aimed at enterprise-grade projects, but I would say Easyling is the most accessible.
Hope this helps.

Using the .load() callback function, this will replace the text:
$(function () {
  $("#Content").load("http://example.com?user=Usuario", function () {
    // Value of the "user" GET parameter from the URL above
    var replacement = "Usuario";
    $(this).html($(this).html().replace("user", replacement));
  });
});
For redirection you can use:
// similar behavior as an HTTP redirect
window.location.replace("url");
// similar behavior as clicking on a link
window.location.href = "url";

The answer is NO, not without using a server-side proxy. For a really good overview of how to use a proxy, see this YUI page: https://developer.yahoo.com/javascript/howto-proxy.html (Be patient, as it will take time to load, but the illustrations are worth it!)
When I try this in jsfiddle to see what the 3 callback parameters contain, the error below appears:
$(function() {
  $(this).load('https://stackoverflow.com/questions/36003367/load-external-page-and-replace-text', function(responseText, textStatus, jqXHR){
    debugger;
  });
});
ERROR:
XMLHttpRequest cannot load https://stackoverflow.com/questions/36003367/load-external-page-and-replace-text.
No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://fiddle.jshell.net' is therefore not allowed access.

Can I use indexeddb across subdomains?

I'm building a Chrome extension and using the db.js wrapper to utilize the indexeddb. The problem is, I've got several subdomains and I'd like to be able to share the information across them.
When I use the Chrome Dev tools to view Resources, all of the individual subdomains have their own copy of the schema I'm creating, and each has its own data.
The only thing I knew to try was to set the document.domain but that didn't help. I wasn't surprised.
Documentation on indexeddb is very slim it seems. I keep finding the same 2 or 3 blog posts copied word for word in several different blogs and nothing specifies that this is possible or impossible.

You can't access the same database from multiple subdomains; the access scope is limited to the HTML origin.
html_Origin = protocol + "://" + hostname + ":" + port + "/";
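You can check this quickly with the standard URL API, whose origin property follows the same idea (it leaves out default ports and the trailing slash). The sameOrigin helper is just a name made up for this sketch.

```javascript
// Storage such as IndexedDB is keyed by origin; the URL API reports it directly.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

console.log(new URL("https://a.example.com/app/").origin); // https://a.example.com
console.log(sameOrigin("https://a.example.com/x", "https://a.example.com/y")); // true
console.log(sameOrigin("https://a.example.com/", "https://b.example.com/"));   // false
console.log(sameOrigin("http://a.example.com/", "https://a.example.com/"));    // false
```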

As @Xan mentioned, if you can use a common origin owned by the extension itself, rather than by the content pages, that sounds like it would be by far the easiest solution. If for whatever reason you can't do that (or for readers who got here wanting to know about regular page javascript or Greasemonkey-style userscripts, rather than extensions), the answer is:
Yes, though it's slightly awkward and takes some work:
Since you're using a number of related subdomains, (rather than completely unrelated domains), there's a technique you can use in that situation. It can be applied to IndexedDB, localStorage, SharedWorker, BroadcastChannel, etc, all of which offer shared functionality between same-origin pages, but for some reason don't respect modifications to document.domain.
(1) Pick one "main" subdomain for the data to belong to. i.e. if your subdomains are https://a.example.com, https://b.example.com, and https://c.example.com, you might choose to have your IndexedDB database stored under the https://a.example.com subdomain.
(2) Use it normally from all the https://a.example.com pages.
(3) On https://b.example.com and https://c.example.com, use javascript to set document.domain = "example.com";. Then also create a hidden <iframe>, and navigate it to some page on the https://a.example.com domain (It doesn't matter what page, as long as you can insert a very little snippet of javascript on there. If you're creating the site, just make an empty page specifically for this purpose. If you're writing an extension or a userscript and so don't have any control over pages on the example.com server, just pick the most lightweight page you can find and insert your script into it. Some kind of "not found" page would probably be fine).
(4) The script on the hidden iframe page need only (a) set document.domain = "example.com";, and (b) notify the parent window when this is done. After that, the parent window can access the iframe window and all its objects without restriction! So the minimal iframe page is something like:
<!doctype html>
<html>
<head>
<script>
document.domain = "example.com";
window.parent.iframeReady(); // function defined & called on parent window
</script>
</head>
<body></body>
</html>
If writing a userscript, you might not want to add externally-accessible functions such as iframeReady() to your unsafeWindow, so instead a better way to notify the main window userscript might be to use a custom event:
window.parent.dispatchEvent(new CustomEvent("iframeReady"));
Which you'd detect by adding a listener for the custom "iframeReady" event to your main page's window.
(5) Once the hidden iframe has informed its parent window that it's ready, script in the parent window can just use iframe.contentWindow.indexedDB, iframe.contentWindow.localStorage, iframe.contentWindow.BroadcastChannel, iframe.contentWindow.SharedWorker instead of window.indexedDB, window.localStorage etc. ...and all these objects will be scoped to the https://a.example.com origin - so they'll have the same shared origin for all of your pages!
The "awkward" part of this technique is mostly that you have to wait for the iframe to load before proceeding. So you can't just blithely initialize IndexedDB in your DOMContentLoaded handler, for example. Also you might want to add some error handling to detect if the hidden iframe fails to load correctly.
Obviously, you should also make sure the hidden iframe is not removed or navigated during the lifetime of your page... OTOH I don't know what the result of that would be, but very likely bad things would happen.
And, a caveat: setting/changing document.domain can be blocked using the Feature-Policy header, in which case this technique will not be usable as described.
However, there is a significantly more-complicated generalization of this technique, that can't be blocked by Feature-Policy, and that also allows entirely unrelated domains to share data, communications, and shared workers (i.e. not just subdomains off a common superdomain). @Xan alludes to it in point (2) of his answer:
The general idea is that, just as above, you create a hidden iframe to provide the correct origin for access; but instead of then just grabbing the iframe window's properties directly, you use script inside the iframe to do all of the work, and you communicate between the iframe and your main window only using postMessage() and addEventListener("message",...).
This works because postMessage() can be used even between different-origin windows. But it's also significantly more complicated because you have to pass everything through some kind of messaging infrastructure that you create between the iframe and the main window, rather than (for example) just using the IndexedDB API directly in your main window's code.

HTML-based storage (indexedDB, localStorage) in Chrome extensions behaves in a way that might not be expected, but it's perfectly natural.
In the background page, the domain is chrome-extension://yourextensionid/, and this is shared by all extension pages and is persistent.
In the content scripts though, you're sharing the HTML storage with the domain you're operating on. This makes life difficult if you want it to share/persist things. Note that sometimes this behavior is actually helpful.
The universal solution is to keep the DB in a background script, and communicate data/requests by means of Messaging API.
This was the usual solution for localStorage use until chrome.storage came along. But since you're using a database, you don't have a ready extension-friendly replacement.
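A bare-bones sketch of that arrangement is below. The message shape ({ op, key, value }) is invented for the example, and a Map stands in for the actual IndexedDB code to keep it short; the point is only that the data lives in the extension's own origin and content scripts talk to it through messages.

```javascript
// background.js: the single place where the data lives.
const records = new Map();

function handleDbMessage(message) {
  switch (message.op) {
    case "put":
      records.set(message.key, message.value);
      return { ok: true };
    case "get":
      return { ok: true, value: records.get(message.key) };
    default:
      return { ok: false, error: "unknown op: " + message.op };
  }
}

if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    sendResponse(handleDbMessage(message));
    // With real (asynchronous) IndexedDB calls, return true here and call
    // sendResponse later, once the request has completed.
  });
}

// Content script side, on any subdomain:
//   chrome.runtime.sendMessage({ op: "get", key: "settings" }, (reply) => { /* ... */ });
```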

transferring localstorage to another website [duplicate]

I am attempting to share data across subdomains using Safari. I would like to use an HTML5 database (specifically localStorage as my data is nothing but key-value pairs).
However, it seems as though data stored to example.com can not be accessed from sub.example.com (or vice versa). Is there any way to share a single database in this situation?

Update 2016
This library from Zendesk worked for me.
Sample:
Hub
// Config s.t. subdomains can get, but only the root domain can set and del
CrossStorageHub.init([
  {origin: /\.example.com$/, allow: ['get']},
  {origin: /:\/\/(www\.)?example.com$/, allow: ['get', 'set', 'del']}
]);
Note the $ for matching the end of the string. The regular expression in the above example will match origins such as valid.example.com, but not invalid.example.com.malicious.com.
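You can try the pattern on its own to see that behaviour. One detail worth noticing is that the dot before "com" is not escaped in the sample, so it matches any character; that is probably harmless for real hostnames, but the stricter variant at the end (an adjustment, not part of the sample above) avoids it.

```javascript
const subdomainPattern = /\.example.com$/;

console.log(subdomainPattern.test("https://valid.example.com"));                 // true
console.log(subdomainPattern.test("https://invalid.example.com.malicious.com")); // false

// The unescaped dot matches any character, not just a literal ".":
console.log(subdomainPattern.test("https://foo.exampleXcom"));                   // true

// Stricter variant with the dot escaped:
const strictPattern = /\.example\.com$/;
console.log(strictPattern.test("https://valid.example.com"));                    // true
console.log(strictPattern.test("https://foo.exampleXcom"));                      // false
```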
Client
var storage = new CrossStorageClient('https://store.example.com/hub.html');
storage.onConnect().then(function() {
  return storage.set('newKey', 'foobar');
}).then(function() {
  return storage.get('existingKey', 'newKey');
}).then(function(res) {
  console.log(res.length); // 2
}).catch(function(err) {
  // Handle error
});
Check https://stackoverflow.com/a/39788742/5064633

There is a simple way to do anything cross-domain: create a simple page, hosted on the domain you are trying to access, that will be included as a proxy iframe. Send a postMessage to that iframe, and do your localStorage manipulation inside the iframe. Here is a link to an article that does this with localStorage. And here is a demo that sends a message to a different page on a subdomain; check the source code, it uses an iframe and postMessage.
EDIT: The new version of the sysend.js library (used by the above demo) uses BroadcastChannel if the browser supports it, but it still requires an iframe. Recent versions also simplify the use of cross-origin messages: the repo contains the HTML of the iframe, which you can use (or you can use a simple HTML file with a single script tag that loads the library), and in the parent you just need to call one function, sysend.proxy('https://example.com');, where example.com needs to have a proxy.html file (you can also use your own filename and a different path).

Google Chrome blocks localStorage access from an iframe on another domain by default unless third-party cookies are enabled, and so does Safari on iPhone. The only solution seems to be opening the parent domain on a different domain and then sending the data to the child via window.postMessage, but that looks ugly and shifty on phones.

Yes. This is how:
For sharing between subdomains of a given superdomain (e.g. foo.example.com vs bar.example.com vs example.com), there's a technique you can use in that situation. It can be applied to localStorage, IndexedDB, SharedWorker, BroadcastChannel, etc, all of which offer shared functionality between same-origin pages, but for some reason don't respect any modification to document.domain that would let them use the superdomain as their origin directly.
NOTE: This technique depends on setting document.domain to allow direct communication between iframes on different subdomains. That functionality has now been deprecated. (As of April 2021 it continues to work in all major browsers however. From Chrome v109 the feature will be disabled unless an Origin-Agent-Cluster: ?0 header is also sent.)
NOTE: Be aware that this technique removes the same-origin defences that block malicious script on a subdomain from affecting the main-domain window, or vice versa, potentially broadening the attack surface for XSS attacks. There are other security implications for shared hosting as well - see the MDN document.domain page for details.
(1) Pick one "main" domain for the data to belong to: i.e. either https://foo.example.com or https://bar.example.com or https://example.com will hold your localStorage data. Let's say you pick https://example.com.
(2) Use localStorage normally for that chosen domain's pages.
(3) On all other https://*.example.com pages (the other domains), use JavaScript to set document.domain = "example.com"; (always the superdomain). Then also create a hidden <iframe>, and navigate it to some page on the chosen https://example.com domain (It doesn't matter what page, as long as you can insert a very little snippet of JavaScript on there. If you're creating the site, just make an empty page specifically for this purpose. If you're writing an extension or a Greasemonkey-style userscript and so don't have any control over pages on the example.com server, just pick the most lightweight page you can find and insert your script into it. Some kind of "not found" page would probably be fine).
(4) The script on the hidden iframe page need only (a) set document.domain = "example.com";, and (b) notify the parent window when this is done. After that, the parent window can access the iframe window and all its objects without restriction! So the minimal iframe page is something like:
<!doctype html>
<html>
<head>
<script>
document.domain = "example.com";
window.parent.iframeReady(); // function defined & called on parent window
</script>
</head>
<body></body>
</html>
If writing a userscript, you might not want to add externally-accessible functions such as iframeReady() to your unsafeWindow, so instead a better way to notify the main window userscript might be to use a custom event:
window.parent.dispatchEvent(new CustomEvent("iframeReady"));
Which you'd detect by adding a listener for the custom "iframeReady" event to your main page's window.
(NOTE: You need to set document.domain = "example.com" even if the iframe's domain is already example.com: Assigning a value to document.domain implicitly sets the origin's port to null, and both ports must match for the iframe and its parent to be considered same-origin. See the note here: https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy#Changing_origin)
(5) Once the hidden iframe has informed its parent window that it's ready, script in the parent window can just use iframe.contentWindow.localStorage, iframe.contentWindow.indexedDB, iframe.contentWindow.BroadcastChannel, iframe.contentWindow.SharedWorker instead of window.localStorage, window.indexedDB, etc. ...and all these objects will be scoped to the chosen https://example.com origin - so they'll have the same shared origin for all of your pages!
The most awkward part of this technique is that you have to wait for the iframe to load before proceeding. So you can't just blithely start using localStorage in your DOMContentLoaded handler, for example. Also you might want to add some error handling to detect if the hidden iframe fails to load correctly.
Obviously, you should also make sure the hidden iframe is not removed or navigated during the lifetime of your page... OTOH I don't know what the result of that would be, but very likely bad things would happen.
And, a caveat: setting/changing document.domain can be blocked using the Feature-Policy header, in which case this technique will not be usable as described.
However, there is a significantly more-complicated generalization of this technique, that can't be blocked by Feature-Policy, and that also allows entirely unrelated domains to share data, communications, and shared workers (i.e. not just subdomains off a common superdomain). @jcubic already described it in their answer, namely:
The general idea is that, just as above, you create a hidden iframe to provide the correct origin for access; but instead of then just grabbing the iframe window's properties directly, you use script inside the iframe to do all of the work, and you communicate between the iframe and your main window only using postMessage() and addEventListener("message",...).
This works because postMessage() can be used even between different-origin Windows. But it's also significantly more complicated because you have to pass everything through some kind of messaging infrastructure that you create between the iframe and the main window, rather than just using the localStorage, IndexedDB, etc. APIs directly in your main window's code.
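To give an idea of what that messaging infrastructure involves, here is a minimal sketch of the iframe ("hub") side. The message format ({ id, op, key, value }) and the list of allowed origins are assumptions made up for this example; libraries such as the ones mentioned in the other answers do considerably more.

```javascript
// Runs inside the hidden iframe served from the origin that owns the data.
const ALLOWED_ORIGINS = ["https://foo.example.com", "https://bar.example.com"];

function handleStorageRequest(storage, origin, request) {
  if (!ALLOWED_ORIGINS.includes(origin)) {
    return { id: request.id, error: "origin not allowed" };
  }
  if (request.op === "set") {
    storage.setItem(request.key, request.value);
    return { id: request.id, value: null };
  }
  if (request.op === "get") {
    return { id: request.id, value: storage.getItem(request.key) };
  }
  return { id: request.id, error: "unknown op" };
}

if (typeof window !== "undefined" && window.parent !== window) {
  window.addEventListener("message", (event) => {
    if (!event.data || typeof event.data !== "object") return;
    const reply = handleStorageRequest(window.localStorage, event.origin, event.data);
    // Reply only to the window and origin that sent the request.
    event.source.postMessage(reply, event.origin);
  });
}
```

The parent page would send requests with iframe.contentWindow.postMessage(request, hubOrigin) and match replies to requests by id in its own "message" listener.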

Is it possible to use jQuery to grab the HTML of another web page into a div?

I am trying to integrate with the FireShot API: given a URL, grab the HTML of another web page into a div, then take a screenshot of it.
Some things I will need to do after getting the HTML
grab <link> & <script> from <head>
grab <body> into <div>
But first, it seems that when I try to do a
$.get("http://google.com", function(data) { ... });
I get a 200 in firebug colored red. I think it has to do with sites not allowing you to grab their page with JS? Then is opening a window the best I can do? But how might I control the other page with jQuery or call fsapi on that page?
UPDATE
I tried to do something like below to do something when the new window is ready, but FireBug says "Permission denied to access property 'document'"
w = window.open($url.val());
setTimeout(function() { // if I dont do this, I always get about:blank, is there a better way around this?
  $(w.document).ready(function() {
    console.log(w.document.body);
  });
}, 1000);

I believe the cross-site security setup within Javascript is basically blocking this. You'd likely have to proxy the content through your own domain.
There are a couple of other options, I think, for breaking the cross-site security constraints, but I'm not sure I'd promote them.

If the other page is located within the same domain as your hosting page, yes, you can. Please refer to jQuery's $().load() API.
Otherwise, the browser's cross-site security policy does not allow it. In that case, you can use an iframe instead of a div.
Some jQuery plugins, e.g. thickbox, can load pages into an appropriate container automatically.

Unless I am mistaken, I do not believe you can AJAX a page cross domain (e.g. from domain1.com to domain2.com). To get around this, you can have a PHP "proxy" script that does the "getting" of the page and then pass it to JS.
For example, in JS you would get() http://mydomain.com/get/?domain=http://google.com and then do what you need to do!
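One detail to watch with such a proxy URL is encoding the target, since its own "?" and "&" would otherwise be read as parameters of the proxy request. A small sketch follows; the server half is written in JavaScript rather than PHP purely for illustration, and both function names are made up.

```javascript
// Client side: build the request URL for the proxy endpoint from the answer.
function buildProxyUrl(targetUrl) {
  return "http://mydomain.com/get/?domain=" + encodeURIComponent(targetUrl);
}

// Server side: recover the target and accept only http(s) URLs.
function parseProxyTarget(requestUrl) {
  const target = new URL(requestUrl).searchParams.get("domain");
  if (target === null) return null;
  try {
    const parsed = new URL(target);
    const allowed = parsed.protocol === "http:" || parsed.protocol === "https:";
    return allowed ? parsed.href : null;
  } catch {
    return null; // not a valid absolute URL
  }
}

const proxied = buildProxyUrl("http://google.com/search?q=a&b=c");
console.log(proxied);
// http://mydomain.com/get/?domain=http%3A%2F%2Fgoogle.com%2Fsearch%3Fq%3Da%26b%3Dc
console.log(parseProxyTarget(proxied)); // http://google.com/search?q=a&b=c
```

With jQuery you would then call $.get(buildProxyUrl(url), callback) against your own domain.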
