I have a page that contains another page from the same domain inside a frame. Is it possible to prevent a script in the framed page from manipulating the top page's DOM (for example, by adding an element or a script)?
You could experiment with removing the "dangerous" functions while keeping private references to them, something like:
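(function () {
    // Private references that are only reachable inside this closure.
    var hiddenrefs = {};
    // Bind to document so the saved reference can still be called later.
    hiddenrefs.dGetElementById = document.getElementById.bind(document);
    document.getElementById = null;
})();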
and so on. However, this would be a very tedious job and is bound to fail anyway.
If this is an attempt to let users run JavaScript in a controlled environment inside an iframe, it is a misguided form of security. The iframe could just issue top.location = "http://www.myevilpage.com", in which case it's game over for you anyway. (This is true even with a different domain: the iframe can still redirect the user and do all sorts of nasty stuff, even if, strictly speaking, it can't access the parent's DOM.)
Letting users run JS code is never safe without filtering the source code for malicious code, and even with filtering it's fairly unsafe because the filtering is usually easy to bypass. Many have tried and many have failed. I'd recommend not letting users run JavaScript, ever.
The best solution is probably to use the HTML5 sandbox attribute on the iframe, which (by default) explicitly disables both scripting and same-origin access to the parent DOM.
See http://msdn.microsoft.com/en-us/hh563496.aspx
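As a rough sketch of the sandbox approach, the frame could be created like this. The helper name createSandboxedFrame and the /child.html URL are made up for illustration; only the sandbox attribute and its allow-scripts token come from the HTML standard. With allow-scripts but without allow-same-origin, the framed page can run scripts but is treated as a unique origin, so it should not be able to reach the parent's DOM.

```javascript
// Hypothetical helper: builds an iframe whose content is sandboxed.
// `doc` is the document to create the element in (normally `document`).
function createSandboxedFrame(doc, src, allowTokens = ['allow-scripts']) {
    const frame = doc.createElement('iframe');
    // An empty sandbox attribute applies every restriction;
    // each token in the list lifts exactly one of them.
    frame.setAttribute('sandbox', allowTokens.join(' '));
    frame.setAttribute('src', src);
    return frame;
}

// Usage in a browser (the URL is a placeholder):
// document.body.appendChild(createSandboxedFrame(document, '/child.html'));
```

Passing an empty token list disables scripting in the frame entirely. For same-origin content, avoid combining allow-scripts with allow-same-origin, since the framed page could then remove its own sandboxing.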
I am developing a Single Page Application (SPA) from scratch, using only HTML, CSS and vanilla JavaScript, without any external frameworks.
My application will initially load a web page, but upon navigating to some other page, say page2, it will load only the data and functions that page2 requires from page2.js instead of reloading the entire web page.
To use the JavaScript I append it to the body. The problem is that when I navigate to the same page again, it appends the same JavaScript again. The more pages I visit, the more scripts are attached.
I have tried removing the existing script tag in favour of the upcoming script and it works well, but is there a way to avoid appending the script to the DOM in the first place?
So my question is: is there a way we can parse (not just plainly read) or execute a JavaScript file without using any physical medium (DOM)?
Although I am expecting pure JavaScript, libraries would also work; I just need a logical explanation.
So my question is: is there a way we can parse (not just plainly read) or execute a JavaScript file without using any physical medium (DOM)?
Yes, you can. How you do it depends on how cutting-edge the environment you're going to support is (either natively, or via tools that can emulate some things in older environments).
In a modern environment...
...you could solve this with dynamic import, which is new in ES2020 (but already supported by up-to-date browsers, and emulated by tools like Webpack and Rollup.js). With dynamic import, you'd do something like this:
async function loadPage(moduleUrl) {
    const mod = await import(moduleUrl);
    mod.main();
}
No matter how many times it's requested, within a realm a module is only loaded once. (Your SPA will be within a realm, so that works.) So the code above will dynamically load the module's code the first time, but just give you back a reference to the already-loaded module the second, third, etc. times. main would be a function you export from the module that tells it you've come (back) to the "page". Your modules might look like this:
// ...code here that only runs once...
// ...perhaps it loads the markup via ajax...
export function main() {
    // ...this function gets called every time the user goes (back) to our "page"
}
Live example on CodeSandbox.
In older environments...
...two answers for you:
You could use eval...
You can read your code from your server as text using ajax, then evaluate it with eval. You will hear that "eval is evil" and that's not a bad high-level understanding for it. :-) The arguments against it are:
1. It requires parsing code; some people claim firing up a code parser is "slow" (for some definition of "slow").
2. It parses and evaluates arbitrary code from strings.
You can see why #2 in particular could be problematic: You have to trust the string you're evaluating. So never use eval on user-supplied content, for instance, in another user's session (User A could be trying to do something malicious with code you run in User B's session).
But in your case, you want and need both of those things, and you trust the source of the string (your server), so it's fine.
But you probably don't need to
I don't think you need that, though, even in older environments. Your code already knows what JavaScript file it needs to load for "page" X, right? So just check whether that code has already been loaded and don't load it again if it has. For instance:
function loadPage(scriptUrl, markupUrl) {
    // ...
    if (!document.querySelector(`script[src="${scriptUrl}"]`)) {
        // ...not found, add a `script` tag for it...
    } else {
        // ...perhaps call a well-known function to run code that should run
        // when you return to the "page"
    }
    // ...
}
Or if you don't want to use the DOM for it, have an object or Map or Set that you use to keep track of what you've already loaded.
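A minimal sketch of that bookkeeping idea, using a Set. The names createScriptRegistry and ensureLoaded are invented for this example, and injectScript stands in for whatever code actually adds the script tag:

```javascript
// Hypothetical helper: remembers which script URLs were already requested.
// `injectScript` is whatever function really loads the script (an assumption).
function createScriptRegistry(injectScript) {
    const loaded = new Set();
    return function ensureLoaded(url) {
        if (loaded.has(url)) {
            return false; // already requested earlier, nothing to do
        }
        loaded.add(url);
        injectScript(url);
        return true; // first request for this URL
    };
}

// Usage in a browser (page2.js is a placeholder):
// const ensureLoaded = createScriptRegistry((url) => {
//     const s = document.createElement('script');
//     s.src = url;
//     document.body.appendChild(s);
// });
// ensureLoaded('page2.js'); // adds the tag
// ensureLoaded('page2.js'); // does nothing the second time
```

The return value tells the caller whether this was the first visit, which is a natural place to decide between loading the script and calling the "you've come back" function.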
Go back to old-school -- web 1.0, DOM level 1.0, has your back. Something like this would do the trick:
<html><head>
<script>
if (!document.getElementById('myScriptId')) {
document.write('<script id="myScriptId" src="/path/to/myscript"></scri' + 'pt>');
}
</script>
This technique gets everybody upset, but it works great to avoid the problems associated with doing dynamic loading via DOM script tag injection. The key is that this causes the document parser to block until the script has loaded, so you don't need to worry about onload/onready events, etc, etc.
One caveat, pull this trick near the start of your document, because you're going to cause the engine to do a partial DOM reparse and mess up speculative loading.
I'm working on a site that uses GTM (Google Tag Manager).
GTM includes a script (tag) from a site that is not allowed in my country. It causes an error in the console and I want to stop this tag from loading. I don't have access to the GTM account, so I have to do it with JS. I know the script is a Custom HTML tag because it stops loading when I try the code below:
dataLayer = [{
    'gtm.blacklist': ['html']
}];
but it also stops loading other custom tags.
How can I stop loading certain custom tag programmatically?
I think it can be done. Looking into GTM code we can see it uses insertBefore function to add Script elements to website (for now, they can change this at any time). So in theory you can "add" some code to the native function and prevent loading of scripts from specific sources. For example, you can run following code before you load GTM:
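Node.prototype.insertBefore = (function () {
    var cached_function = Node.prototype.insertBefore;
    return function (script) {
        // Not every inserted node has a string src (text nodes, divs, ...), so check before calling indexOf.
        if (script && typeof script.src === "string" &&
                script.src.indexOf("www.somesource.com/script.js") !== -1) { // change to the src you don't want to load on your page
            return script; // don't add the script; hand the node back as insertBefore normally would
        } else {
            // use .apply() to call the native function with the original this and arguments
            return cached_function.apply(this, arguments);
        }
    };
})();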
(code taken from: Adding code to a javascript function programmatically)
I didn't test this code, so I am not advising you to use it without proper testing, and you might decide not to do it at all (before you do, you might want to read: Why is extending native objects a bad practice?). I agree with Eike's answer; all I am saying is that it is possible to prevent the loading of custom tags programmatically.
You cannot block a specific custom HTML tag programmatically. One reason is that it would be pointless: "custom HTML" means "arbitrary code executed in the context of your site", so the code could simply be put into another HTML tag and be run from there.
That you are not in control of the instance of GTM that runs in your site (which effectively means that you are not in control of your site) is not a use case that Google could cater for in any meaningful way (if you are in control of GTM then simply remove the tag).
If you mean that you want to disallow scripts from a certain origin, then you might look into Content Security Policies (which work regardless of whether the script runs from GTM or any other source). However, CSPs are notoriously hard to implement (and while it is possible to implement them from within GTM, this works only for limited testing, not for production use).
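To give an idea of what such a policy looks like, here is a small sketch that assembles a Content-Security-Policy header value. The helper buildCspHeader and the allow-list are assumptions for illustration only; a real policy for a GTM site would usually need more entries (for example, to cover inline scripts), and the header itself has to be sent by the server.

```javascript
// Hypothetical helper: serialises a directive map into a CSP header value.
function buildCspHeader(directives) {
    return Object.entries(directives)
        .map(([name, sources]) => `${name} ${sources.join(' ')}`)
        .join('; ');
}

// Placeholder allow-list: same-origin scripts plus the GTM loader host.
// Scripts from any origin not listed here would be refused by the browser.
const policy = buildCspHeader({
    'script-src': ["'self'", 'https://www.googletagmanager.com'],
});

// The server would then send: Content-Security-Policy: <policy>
```

Because the blocked origin is simply left out of script-src, the browser refuses to load it no matter which tag tries to inject it.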
I'm not a full-time JavaScript developer. We have a web app, and one piece of it writes out a small informational widget onto another domain. This is literally just an HTML table with some values written into it. I have had to do this a couple of times over the past 8 years and I always end up doing it via a script that just uses document.write to output the table.
For example:
document.write('<table border="1"><tr><td>here is some content</td></tr></table>');
on theirdomain.com
<body>
....
<script src='http://ourdomain.com/arc/v1/api/inventory/1' type='text/javascript'></script>
.....
</body>
I always think this is a bit ugly, but it works fine and we always have control over the content (or a trusted representative has control, for things like your current inventory). So another project like this came up and I coded it up in about 5 minutes using document.write. Somebody else thinks this is just too ugly, but I don't see what the problem is.
Regarding the widget aspect, I have also done iframe and JSONP implementations, but an iframe tends not to play well with the other site's CSS and JSONP tends to be too much. Is there some security element I'm missing? Or is what I'm doing OK? What would be the strongest argument against using this technique? Is there a best practice I don't get?
To be honest, I don't really see a problem. Yes, document.write is very old-school, but it is simple and universally supported; you can depend on it working the same in every browser.
For your application (writing out an HTML table with some data), I don't think a more complex solution is necessary if you're willing to assume a few small risks. DOM mutation that works correctly across browsers is not easy to get right if you're not using jQuery (et al).
The risks of document.write:
Your script must be loaded synchronously. This means a normal inline script tag (like you're already using). However, if someone gets clever and adds the async or defer attributes to your script tag (or does something fancy like appending a dynamically created script element to the head), your script will be loaded asynchronously.
This means that when your script eventually loads and calls write, the main document may have already finished loading and the document is "closed". Calling write on a closed document implicitly calls open, which completely clears the DOM – it's essentially the same as wiping the page clean and starting from scratch. You don't want that.
Because your script is loaded synchronously, you put third-party pages at the mercy of your server. If your server goes down or gets overloaded and responds slowly, every page that contains your script tag cannot finish loading until your server responds or the browser times out the request.
The people who put your widget on their website will not be happy.
If you're confident in your uptime, then there's really no reason to change what you're doing.
The alternative is to load your script asynchronously and insert your table into the correct spot in the DOM. This means third parties would have to insert a script snippet (either <script async src="..."> or the dynamic script tag insertion trick). They would also need to carve out a special <div id="tablegoeshere"> for you to put your table into.
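A sketch of what the asynchronously loaded script might do instead of calling document.write. The function name renderInventoryTable and the sample rows are made up; only the tablegoeshere id comes from the answer above.

```javascript
// Hypothetical widget code: builds the table with DOM methods and puts it
// into the placeholder element the host page provides.
// `doc` is the document to work on (normally `document`).
function renderInventoryTable(doc, rows) {
    const target = doc.getElementById('tablegoeshere');
    if (!target) {
        return false; // host page has no placeholder, so do nothing
    }
    const table = doc.createElement('table');
    for (const row of rows) {
        const tr = doc.createElement('tr');
        for (const cell of row) {
            const td = doc.createElement('td');
            // textContent inserts the value as text, never as markup
            td.textContent = String(cell);
            tr.appendChild(td);
        }
        table.appendChild(tr);
    }
    target.appendChild(table);
    return true;
}

// Usage in a browser (the data is a placeholder):
// renderInventoryTable(document, [['here is some content']]);
```

Since nothing here depends on the document still being open, it should not matter whether the host page loads the script with async, defer or a dynamically inserted script element, as long as the placeholder exists by the time the function runs.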
Using document.write() after the entire DOM has loaded replaces the document, so you can no longer access the existing DOM.
See Why do I need to use document.write instead of DOM manipulation methods?.
In that case you are giving up a very powerful feature of the web page...
Is there some security element I'm missing?
The security risk is on their side: theirdomain.com is trusting your domain's script code not to do anything malicious. Your client script will run in the context of their domain and can do what it likes, such as stealing cookies or embedding a key logger (not that you would do that, of course). As long as they trust you, that is fine.
Here is a wireframe to illustrate the structure of a legacy project that shows some performance issues:
For every dialog (from jQuery UI) that opens, a new iframe is created, all JS from Home is downloaded again and all objects are re-instantiated. Can I pass a reference to Home's jQuery to every new iframe and still work within each iframe's isolated scope?
For example:
[Home scope]
$("#some-el").data('foo', 'bar');
console.log($("#some-el").data('foo')); // results bar
[App1 scope]
//after defined in Home first run
console.log($("#some-el").data('foo')); // results undefined
PS: Remember this is a legacy architecture and all solutions must take this scenario into account.
I've encountered this situation before. One approach is to define some javascript that gets loaded into the iframes that just reroutes any function calls to top.functionCall() instead of containing their actual definition. It becomes very simple if all of your functions are under one namespace, like so:
Parent window js:
var namespace = (function () {
// all of your functions are in here as properties of namespace
})();
iframe window js:
var namespace = top.namespace;
One issue with this is any context sensitive functions (functions that rely or operate on the window object) will most likely break.
Actually, if all of these are hosted in the same place, the browser will NOT be downloading the files multiple times. Rather, it will be caching the first result, and then pulling from cache in the second. Iframes are treated as a separate context, so you won't need to worry about variable or form conflicts.
Assuming that downloading the same file twice is your primary concern, then you should be ok there.
An alternative design would be to use AJAX instead of iframed content, but having been where you are in working with legacy apps, I realize how hard that can be to do without real JSON / REST calls available. One thing I've done is change the views inside the iframes to be "partials", returning only the necessary HTML contents without the HTML head etc., and load them using jQuery's .load() (for example $(selector).load(url)). This gets complex, as you will need to execute bindings post-load and carefully track form IDs etc., but it can be done.
I've been struggling with a problem for a few hours now, and I would appreciate either some help in accomplishing my goal, or confirmation that what I'm trying to do is in fact impossible.
I have a webapp that takes the selected text (document.getSelection()) as input, from an arbitrary webpage. While it would be possible to use a bookmarklet to do such scripting fairly easily, it's best for the end-user if I can accomplish this with an iframe.
The parent frame is my site with this script:
$('#frame').load(function(){
    // this event won't be triggered
    $(window).mouseup(function(){
        doStuff(window.getSelection());
    });
    // this will throw a security error
    $(window.frames[0].document).mouseup(function(){
        doStuff(window.frames[0].document.getSelection());
    });
});
An arbitrary site is in the child frame. Unless the child document is from my domain, access is forbidden for XSS security reasons. I've tried several variations and attempted hacks, including setting the iframe src to my domain with the third party URL as an argument, and then redirecting to the third party URL. In a sense, I'm glad that it didn't work (because if it did, then XSS security would still have a long way to go...)
Another option would be downloading the third party page and serving it from my domain like a proxy server, but I've already run into a bunch of problems with relative paths to files, which are sometimes easy to make absolute, but sometimes a fool's errand (such as when the files are accessed via script).
I've concluded that I might just be out of luck. Perhaps an important distinction for my case is that I only want to access the .getSelection() method for the child. No need to be able to access cookies or keystrokes or interact with the DOM. Maybe it doesn't make a difference, but maybe it does.
You could try the proxy method but insert a base tag that points to the original domain. The paths should be taken care of then.
I wouldn't rely on any XSS hacks even if you could find them -- they'd likely be corrected and most likely not crossbrowser.
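A sketch of the base tag suggestion for the proxy approach. The helper name injectBaseTag is made up, and it assumes your proxy already has the third-party markup as a string:

```javascript
// Hypothetical helper: adds <base href="..."> to proxied markup so that
// relative URLs resolve against the original site instead of your domain.
function injectBaseTag(html, originalUrl) {
    // Escape characters that would break out of the attribute value.
    const href = originalUrl.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
    const baseTag = `<base href="${href}">`;
    // Matches <head> or <head ...attributes>, but not <header>.
    const headOpen = /<head(\s[^>]*)?>/i;
    if (headOpen.test(html)) {
        // A replacer function keeps "$" in the URL from being read as a pattern.
        return html.replace(headOpen, (match) => match + baseTag);
    }
    // No <head> found: put the tag first and let the HTML parser cope.
    return baseTag + html;
}

// Usage (the URL is a placeholder):
// const served = injectBaseTag(fetchedHtml, 'https://thirdparty.example/some/page/');
```

This only changes how relative URLs in the document are resolved; URLs that the third-party scripts build themselves may still need separate handling.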
One possibility is to write your iframe's document with text fetched via XHR, or to use jQuery's load() function. This only works if there is no navigation inside the iframe, though.
I didn't fully read your question :-) but maybe I have a solution.
It involves passing data between the parent and child frames, both ways.
You can WRITE (but not READ) the hash of the parent from the child (the hash is the part after # in http://url#HASH_PART).
So, in the parent frame, just set an interval to check the value, say every 50ms.
function checkHash() {
    if (window.location.hash == "#something") {
        // my child frame set this value using:
        // parent.window.location.hash = "something";
        doSomething();
    }
}
// poll the hash every 50 ms, as described above
setInterval(checkHash, 50);
For further details, including parent-to-child communication, the full article explaining this (it also has a demo link) can be found here.