Remove target attribute in iframe link - javascript

I have an iframe on one of my pages that shows content on an external site (vendor product). All works well except a few links that have target="_main" in them. These links open in a new tab. What I need to do is strip the target attribute from all links within the iframe so all links stay within the iframe rather than opening a new window or tab.
It seems like there should be a simple javascript solution to this.
If I can't get this to work in an iframe then I will be forced to re-create all the content on my site, which would be very painful, to say the least.
Any help???

You need access to the external site's codebase in order to fix this dynamically. In the external site's code, check whether the site is running inside an iframe. If it is, run a function that removes the target attribute from every link.
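// vendor's product page (requires jQuery)
if ( self !== top ){
    // The page is framed, so strip the target attribute from every link.
    $('a').removeAttr('target');
} // else do nothing
self !== top is true when the current window is not the topmost window, that is, when the page is loaded inside a frame.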
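For vendor pages that do not load jQuery, the same check can be sketched in plain JavaScript. This is only a minimal sketch: the helper name stripLinkTargets is hypothetical, and like the snippet above it has to run on the framed (vendor) page itself, not on the embedding page.

```javascript
// Hypothetical helper: remove the target attribute from every link in `doc`
// and report how many links were touched.
function stripLinkTargets(doc) {
  const links = doc.querySelectorAll('a[target]');
  links.forEach((link) => link.removeAttribute('target'));
  return links.length;
}

// Only act in a browser, and only when the page is loaded inside a frame.
if (typeof window !== 'undefined' && window.self !== window.top) {
  stripLinkTargets(document);
}
```

Passing the document in as a parameter keeps the helper easy to try out against any document-like object.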

Not directly that I am aware of.
However, if you have access to a scripting language (like PHP or ASP) on your site you can read your vendors' page directly from your server, do a find & replace on it & then render that onto your site; either in an iframe or however else you want.
Edit
There are many ways to do this, depending on how much control you have over your PHP config. Have a look at these resources & see if you can figure out what to do. If not, I would suggest you start a new question specifically focused on what it is you are struggling with.
http://php.net/manual/en/function.file-get-contents.php With this method you have to be aware of the tip on the page:
A URL can be used as a filename with this function if the fopen wrappers have been enabled. See fopen() for more details on how to specify the filename. See the Supported Protocols and Wrappers for links to information about what abilities the various wrappers have, notes on their usage, and information on any predefined variables they may provide.
http://php.net/manual/en/function.fsockopen.php Again, be aware of the warning & notes.
http://php.net/manual/en/book.curl.php
I personally have written a class that uses fsockopen because it is the most flexible for my needs but usually file_get_contents does the trick because it is the simplest to set up out of the 3 options, if you have the right wrappers configured & you don't need to start working with SSL or funny protocols. I stay away from CURL because you have to install a library in order for it to work. I prefer my code to be portable for standard installs.
Some useful links that might help:
PHP readfile from external server
Possible Example
$vendorUrl = isset( $_REQUEST['vendor'] ) ? $_REQUEST['vendor'] : 'www.default-vendor.com';
$iframeContents = file_get_contents("http://$vendorUrl", false);
// Output the vendor page with the target attributes removed.
echo str_replace( 'target="_main"', '', $iframeContents );
Then you just have to point your iframe at whatever page you save this script in on your server & include ?vendor=www.vendor-url.com as the query string.
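The find & replace step above only matches the exact string target="_main". A slightly more tolerant version of that step is sketched here in JavaScript rather than PHP (for instance for a Node.js proxy); it removes any target attribute from <a> tags. The function name stripTargetAttributes is hypothetical, and a regular expression is only an approximation of real HTML parsing.

```javascript
// Hypothetical helper: drop the target attribute from every <a ...> tag
// in an HTML string, whatever its value or quoting style.
function stripTargetAttributes(html) {
  const targetAttr = /(<a\b[^>]*?)\s+target\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi;
  return html.replace(targetAttr, '$1');
}

const sample = '<a href="/buy" target="_main">Buy</a>';
console.log(stripTargetAttributes(sample)); // <a href="/buy">Buy</a>
```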

How about giving your own iframe the name _main?
<iframe name="_main" ...
The other links should then open in that iframe too.
Regards, Max

Related

Edit cross-domain Iframe content Locally Only

As many of us know there is no way to edit a Cross Domain IFrame due to the Same Origin Policy.
Is there a way around this if we use the Stylish extension etc. locally only?
Take this video being launched inside an iframe for example:
I need to simply add "zoom:2;" onto "#video21588864 iframe figure"
If this is 100% not possible, why am I able to do it successfully in the Inspector window, but not automatically? Is there really ZERO automatic local ways around this using Javascript or something?
There is no way you can access the content inside the <iframe> in a cross-origin fashion. CORS headers do not change this: they govern requests such as XMLHttpRequest and fetch, not script access to a framed document.
Why am I able to do it successfully in the Inspector window
The developer tools are separate from your document. They can do many things that normal JavaScript in a webpage cannot.
Rationale
There is a reason why you cannot access the content inside an iframe. Consider this, a user was logged into their bank webpage. A token is stored in a cookie to prove that the user is logged in.
Now you include an iframe in your webpage and load the bank's webpage in it. Since the cookie contains a valid token, the iframe will show that the user has been logged in.
Wouldn't it be great if you can access the iframe and send yourself some money? Well, this is exactly why it's not allowed, 100% not possible, given that the browser is implemented correctly.
Addendum
I have decided to add this part after seeing that you have mentioned the word locally. Now, I do not know exactly what you are trying to do, but it is possible to manipulate the content inside the iframe if you have an elevated privilege, including:
a userscript
an extension with proper permissions acquired
developer tools
the browser itself
If you merely want to add zoom: 2 to videos from ESPN on your own computer, I would suggest creating a userscript, which has much higher privileges than a normal webpage and is much easier to make than an extension.
// ==UserScript==
// @match http://www.espn.com/core/video/iframe*
// ==/UserScript==
document.querySelector("figure").style.zoom = 2;
Save that as myscript.user.js. Open up chrome://extensions and drag that file onto the page. The userscript will have a higher privilege and can access the page.
One way to edit cross-origin domains in an iframe is to load them via the server (PHP) and modify the html by adding a base tag: <base href='http://www.espn.com'/> It's no guarantee that they will let you load all the elements as html and still render the page properly but can work in some cases and is worth the try.
A very simple iframe-loader.php would look like this:
<?php
error_reporting(0); // hide warnings caused by malformed third-party HTML
$url = $_REQUEST['url'];
$html = file_get_contents($url);
$dom = new DOMDocument;
$dom->strictErrorChecking = false;
$dom->recover = true;
$dom->loadHTML($html);
//Add base tag so relative URLs resolve against the original site
$head = $dom->getElementsByTagName('head')->item(0);
$base = $dom->createElement('base');
$base->setAttribute('href',$url);
if ($head->hasChildNodes()) {
    $head->insertBefore($base,$head->firstChild);
} else {
    $head->appendChild($base);
}
//Print result
echo $dom->saveHTML();
Then you load a url by going to /iframe-loader.php?url=http://www.espn.com/core/video/iframe?id=21588864...
Good Luck!

Can I use indexeddb across subdomains?

I'm building a Chrome extension and using the db.js wrapper to utilize the indexeddb. The problem is, I've got several subdomains and I'd like to be able to share the information across them.
When I use the Chrome Dev tools to view Resources, all of the individual subdomains have their own copy of the schema I'm creating, and each has its own data.
The only thing I knew to try was to set the document.domain but that didn't help. I wasn't surprised.
Documentation on indexeddb is very slim it seems. I keep finding the same 2 or 3 blog posts copied word for word in several different blogs and nothing specifies that this is possible or impossible.
You can't access the same database from multiple subdomains, the access scope is limited to html origin.
html_Origin = protocol + "://" + hostname + ":" + port + "/";
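The origin rule can be checked with the standard URL API, whose origin property combines scheme, host, and port (a default port is left out). A small sketch; the helper name sameOrigin is made up for illustration:

```javascript
// Two URLs share storage such as IndexedDB only if their origins match.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

console.log(new URL('https://a.example.com/app/index.html').origin); // https://a.example.com
console.log(sameOrigin('https://a.example.com/x', 'https://b.example.com/x')); // false
console.log(sameOrigin('https://a.example.com/x', 'https://a.example.com:443/y')); // true
```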
As @Xan mentioned, if you can use a common origin owned by the extension itself, rather than by the content pages, that sounds like it would be by far the easiest solution. If for whatever reason you can't do that (or for readers who got here wanting to know about regular page javascript or Greasemonkey-style userscripts, rather than extensions), the answer is:
Yes, though it's slightly awkward and takes some work:
Since you're using a number of related subdomains, (rather than completely unrelated domains), there's a technique you can use in that situation. It can be applied to IndexedDB, localStorage, SharedWorker, BroadcastChannel, etc, all of which offer shared functionality between same-origin pages, but for some reason don't respect modifications to document.domain.
(1) Pick one "main" subdomain for the data to belong to. For example, if your subdomains are https://a.example.com, https://b.example.com, and https://c.example.com, you might choose to have your IndexedDB database stored under the https://a.example.com subdomain.
(2) Use it normally from all the https://a.example.com pages.
(3) On https://b.example.com and https://c.example.com, use javascript to set document.domain = "example.com";. Then also create a hidden <iframe>, and navigate it to some page on the https://a.example.com domain (It doesn't matter what page, as long as you can insert a very little snippet of javascript on there. If you're creating the site, just make an empty page specifically for this purpose. If you're writing an extension or a userscript and so don't have any control over pages on the example.com server, just pick the most lightweight page you can find and insert your script into it. Some kind of "not found" page would probably be fine).
(4) The script on the hidden iframe page need only (a) set document.domain = "example.com";, and (b) notify the parent window when this is done. After that, the parent window can access the iframe window and all its objects without restriction! So the minimal iframe page is something like:
<!doctype html>
<html>
<head>
<script>
document.domain = "example.com";
window.parent.iframeReady(); // function defined & called on parent window
</script>
</head>
<body></body>
</html>
If writing a userscript, you might not want to add externally-accessible functions such as iframeReady() to your unsafeWindow, so instead a better way to notify the main window userscript might be to use a custom event:
window.parent.dispatchEvent(new CustomEvent("iframeReady"));
Which you'd detect by adding a listener for the custom "iframeReady" event to your main page's window.
(5) Once the hidden iframe has informed its parent window that it's ready, script in the parent window can just use iframe.contentWindow.indexedDB, iframe.contentWindow.localStorage, iframe.contentWindow.BroadcastChannel, iframe.contentWindow.SharedWorker instead of window.indexedDB, window.localStorage etc. All these objects will be scoped to the https://a.example.com origin, so they'll have the same shared origin for all of your pages!
The "awkward" part of this technique is mostly that you have to wait for the iframe to load before proceeding. So you can't just blithely initialize IndexedDB in your DOMContentLoaded handler, for example. Also you might want to add some error handling to detect if the hidden iframe fails to load correctly.
Obviously, you should also make sure the hidden iframe is not removed or navigated during the lifetime of your page. Offhand I don't know what the result of that would be, but very likely bad things would happen.
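The waiting and error handling mentioned above can be sketched as a promise that resolves when the "iframeReady" custom event arrives and rejects after a timeout. The helper name waitForEvent and the timeout values are assumptions, not part of the technique itself.

```javascript
// Hypothetical helper: resolve when `eventName` fires on `target`,
// reject if it has not fired within `timeoutMs` milliseconds.
function waitForEvent(target, eventName, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      target.removeEventListener(eventName, onEvent);
      reject(new Error(`Timed out waiting for "${eventName}"`));
    }, timeoutMs);

    function onEvent(event) {
      clearTimeout(timer);
      resolve(event);
    }

    target.addEventListener(eventName, onEvent, { once: true });
  });
}

// Browser usage, assuming the hidden iframe dispatches "iframeReady" on its parent:
// waitForEvent(window, 'iframeReady', 5000)
//   .then(() => { /* safe to use iframe.contentWindow.indexedDB now */ })
//   .catch((err) => console.error('Hidden iframe failed to load', err));
```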
And, a caveat: setting/changing document.domain can be blocked using the Feature-Policy header, in which case this technique will not be usable as described.
However, there is a significantly more-complicated generalization of this technique, that can't be blocked by Feature-Policy, and that also allows entirely unrelated domains to share data, communications, and shared workers (i.e. not just subdomains off a common superdomain). @Xan alludes to it in point (2) of his answer:
The general idea is that, just as above, you create a hidden iframe to provide the correct origin for access; but instead of then just grabbing the iframe window's properties directly, you use script inside the iframe to do all of the work, and you communicate between the iframe and your main window only using postMessage() and addEventListener("message",...).
This works because postMessage() can be used even between different-origin windows. But it's also significantly more complicated because you have to pass everything through some kind of messaging infrastructure that you create between the iframe and the main window, rather than (for example) just using the IndexedDB API directly in your main window's code.
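A rough sketch of the iframe side of such a messaging layer follows. The message shape ({ id, op, key, value }), the allow-list, and the Map standing in for IndexedDB are all assumptions made for illustration; only postMessage() and the "message" event come from the description above.

```javascript
// Origins that are allowed to talk to the hidden iframe (hypothetical list).
const ALLOWED_ORIGINS = new Set(['https://b.example.com', 'https://c.example.com']);

// Handle one request; returns the reply object, or null to ignore the sender.
function handleRequest(event, store) {
  if (!ALLOWED_ORIGINS.has(event.origin)) return null; // ignore unknown origins
  const { id, op, key, value } = event.data || {};
  if (op === 'get') return { id, result: store.get(key) };
  if (op === 'set') {
    store.set(key, value);
    return { id, result: true };
  }
  return { id, error: 'unknown op' };
}

// Wiring inside the iframe page (skipped outside a browser).
if (typeof window !== 'undefined') {
  const store = new Map(); // stand-in for the real IndexedDB access
  window.addEventListener('message', (event) => {
    const reply = handleRequest(event, store);
    if (reply) event.source.postMessage(reply, event.origin);
  });
}
```

The id field lets the parent window match each reply to the request that caused it.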
HTML-based storage (indexedDB, localStorage) in Chrome extensions behaves in a way that might not be expected, but it's perfectly natural.
In the background page, the domain is chrome-extension://yourextensionid/, and this is shared by all extension pages and is persistent.
In the content scripts though, you're sharing the HTML storage with the domain you're operating on. This makes life difficult if you want it to share/persist things. Note that sometimes this behavior is actually helpful.
The universal solution is to keep the DB in a background script, and communicate data/requests by means of Messaging API.
This was the usual solution for localStorage use until chrome.storage came along. But since you're using a database, you don't have a ready extension-friendly replacement.
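The background-page approach could look roughly like this. The request format and the handleDbRequest name are hypothetical, and a Map stands in for the real database; chrome.runtime.onMessage and chrome.runtime.sendMessage are the extension messaging calls.

```javascript
// Hypothetical request handler; `db` stands in for an IndexedDB wrapper
// that lives in the background page (the extension's own origin).
function handleDbRequest(message, db) {
  switch (message.type) {
    case 'put':
      db.set(message.key, message.value);
      return { ok: true };
    case 'get':
      return { ok: true, value: db.get(message.key) };
    default:
      return { ok: false, error: 'unknown request type' };
  }
}

// Background script wiring (only runs inside an extension).
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  const db = new Map();
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    sendResponse(handleDbRequest(message, db));
  });
}

// Content script side, on any subdomain:
// chrome.runtime.sendMessage({ type: 'get', key: 'settings' }, (reply) => {
//   console.log(reply.value);
// });
```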

Reading document.links from an IFrame

EDIT:
Just a quick mention as to the nature of this program. The purpose of this program is for web inventory. Drawing different links and other content into a type of hierarchy. What I'm having trouble with is pulling a list of links from a webpage within an IFrame.
I get the feeling this one is gonna bite me hard. (other posts indicate relevance to xss and domain controls)
I'm just trying something with javascript and IFrames. Basically I have a panel with an IFrame inside that goes to whatever website you want it to. I'm trying to generate a list of links from the webpage within the IFrame. It's strictly read only.
Yet I keep coming up against the permission denied problem.
I understand this is there to stop cross site scripting attacks and the resolution seems to be to set the document domain to the host site.
JavaScript permission denied. How to allow cross domain scripting between trusted domains?
However, I don't think this will work if I'm trying to go from site to site.
Here's the code I have so far, pretty simple:
function getFrameLinks()
{
/* You can all ignore this. This is here because there is a frame within a frame. It should have no effect on the program. Just start reading from 'contentFrameElem' */
//ignore this
var functionFrameElem = document.getElementById("function-IFrame");
console.log("element by id parent frame ");
console.log(functionFrameElem);
var functionFrameData = functionFrameElem.contentDocument;
console.log("Element data");
console.log(functionFrameData);
//get the content and turn it into a doc
var contentFrameElem = functionFrameData.getElementById("content-Frame")
console.log(contentFrameElem);
var contentFrameData = contentFrameElem.contentDocument;
console.log(contentFrameData);
//get the links
//var contentFrameLinks = contentFrameData.links;
var contentFrameLinks = contentFrameData.getElementsByTagName('a');
return contentFrameLinks;
}
Goal: OK, so this is not allowed because it is very similar to XSS. Perhaps someone could point out a way to store the document locally. I don't seem to have any problems accessing document.links with internal pages in the frame.
Possibly some sort of temp database or cache. The simpler the solution the better.
If you want to read it just for yourself and in your own browser, you can write a simple proxy with PHP on your server. The simplest code:
<?php /* proxy.php */ readfile($_GET['url']); ?>
Now set your iframe src to your proxy file:
<iframe src="http://localhost/proxy.php?url=http://www.google.com"
id="function-IFrame"></iframe>
Now the iframe content is served from your (local) server, so your page can access it.
If you set the URL programmatically, remember to encode it (urlencode in PHP or encodeURIComponent in JS).
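As a small illustration of the encoding step, the proxy URL can be built with encodeURIComponent so that the target URL's own query string survives as a single url parameter. The helper name buildProxyUrl is hypothetical; proxy.php is the file from the answer above.

```javascript
// Hypothetical helper: wrap a target URL in the proxy's "url" parameter.
function buildProxyUrl(proxyBase, targetUrl) {
  return proxyBase + '?url=' + encodeURIComponent(targetUrl);
}

console.log(buildProxyUrl('http://localhost/proxy.php', 'http://www.google.com/?q=a b'));
// http://localhost/proxy.php?url=http%3A%2F%2Fwww.google.com%2F%3Fq%3Da%20b
```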
Here is a bookmarklet you can run on any page (assuming the links are not in an iframe)
javascript:var x=function(){var lnks=document.links,list=[];for (var i=0,n=lnks.length;i<n;i++) {var href = lnks[i].href; list.push(href)};if (list.length>0) { var w=window.open('','_blank');w.document.write(list.length+' links found<br/><ul><li>'+list.sort().join('</li><li>')+'</ul>');w.document.close()}};void(x());
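Unminified, the core of that bookmarklet is just a walk over document.links. A readable sketch of that part, leaving out the popup window; the function name collectLinks is made up here:

```javascript
// Hypothetical helper: return the sorted href of every link in `doc`.
// Works on the current document only, not on cross-origin frames.
function collectLinks(doc) {
  return Array.from(doc.links, (link) => link.href).sort();
}

// In a browser console:
// console.log(collectLinks(document).join('\n'));
```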
The other way (on Windows) is to save your HTML with the extension .HTA.
Then you can grab whatever lives in the iframe.
You might be interested in using the YQL (Yahoo Query Language) to retrieve filtered results from remote urls..
example of retrieving all the links from the yahoo.com domain

How to get url of embedding page for a javascript widget

(Rewording the question as there were very few views otherwise).
I want to build a widget that others can include on their website, and the widget itself will be hosted on my website. I am aware of just one method to build widgets that can be embedded on other websites: the website that wants to embed the widget sources a javascript from my site, which does "document.write" on the page. Something like:
<script language="javascript" src="http://www.my-website-that-will-host-the-widget.com/javascript-emitter.php?id=1234&width=200&bordercolor=000000&bg=ffffff&textcolor=000000"></script>
Now, I want to make a particular widget accessible from only particular domains. For this, I want to know the URL of the page that is embedding my widget reliably . No-one should be able to spoof it. For example, if I have an explicit variable in the embedding code, people can change it.
How do I do it? (I also want the person who is embedding my widget to have to write as little code as possible.)
regards,
JP
Explanation 1:
Let's say I want to do this: if the widget is accessed from 1.com, display A, else display B. How do I do it reliably? The thing is, "A" is something that should not be visible in the code unless the widget is accessed from 1.com. (Thus, if it is embedded in 2.com, I don't want to output if(location.href == 1.com) write(A) else write(B).)
Note 1:
(As an aside, if someone feels my method is not good/efficient and can suggest better methods/tutorials, etc., that would be great help. Most google queries give you sites that explain how to build/obtain widget for "your site".... and usually point to websites that allow you to build widgets hosted with them, I want to understand how to build widgets that can be embedded by other websites from my site)
In javascript on the client-side, you can use location.href to get the url of the current page:
var url = location.href;
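If you do not want to output any javascript at all for a forbidden domain, in your php you can check the Referer header, which PHP exposes as $_SERVER['HTTP_REFERER'] (the old $HTTP_REFERER global depended on register_globals, which modern PHP no longer supports). In your javascript-emitter.php script try this:
<?php
// The header is optional, so fall back to an empty string when it is missing.
echo isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
?>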
However be warned that this is not always to be trusted: it is up to the client (the browser) to send the correct REFERER header. And of course if someone really wanted to include your widget on their site, they could easily request your javascript server-side spoofing the REFERER header - that is set it to something that's on your whitelist - before forwarding it to the client.
In short there's no way you can easily and absolutely block blacklisted sites from using your widget.
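For what it is worth, the whitelist check itself is short. Here it is sketched in JavaScript (for example for a Node.js emitter) rather than PHP; the host list and the function name isAllowedReferer are assumptions, and as noted above the header can be spoofed, so this only stops casual embedding.

```javascript
// Hosts allowed to receive variant "A" of the widget (hypothetical list).
const ALLOWED_HOSTS = new Set(['1.com', 'www.1.com']);

// Compare the Referer's hostname exactly; a substring test would also
// accept URLs such as http://2.com/?x=1.com.
function isAllowedReferer(refererHeader) {
  if (!refererHeader) return false; // header missing or empty
  try {
    return ALLOWED_HOSTS.has(new URL(refererHeader).hostname);
  } catch (err) {
    return false; // not a parseable URL
  }
}
```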

Is there a way to mitigate downloading of resources (images/css and js files) with Javascript?

I have a html page on my localhost - get_description.html.
The snippet below is part of the code:
<input type="text" id="url"/>
<button id="get_description_button">Get description</button>
<iframe id="description_container" src="#"></iframe>
When the button is clicked the src of the iframe is set to the url entered in the textbox. The pages fetched this way are very big with lots of linked files. The only part of the page I am interested in is a block of text contained in a <div id="description"> element.
Is there a way to mitigate downloading of resources linked in the page that loads into the iframe?
I don't want to use curl because the data is only available to logged in users and the steps to take with curl to get the content is too complicated. The iframe is simple as I use this on a box which sends the right cookies to identify the request as coming from a logged in user, but the problem is that it is very wasteful to get nearly 1 MB of data to keep 1 KB of it and throw out the rest.
Edit
If the proposed method just works in Firefox it is fine, so I added Firefox tag. Also, it is possible that the answer actually is from the realm of Firefox add-on techniques, so I added that tag as well.
The problem is not that I cannot get at what I'm looking for, rather, the problem is the easy iframe method is wasteful.
I know that Firefox does allow loading only the text of a page. If you open a page and press Ctrl+U you are taken to the 'view page source' window. There, links behave as normal and are clickable; if you click on a link in source view, the source of the new page is loaded into the view source window, without the linked resources being downloaded, which is exactly what I'm trying to get. But I don't know how to access this behaviour.
Another example is the Adblock add-on. It somehow kills elements before they get loaded. With plain Javascript this is not possible, because it is triggered too late to intervene in time.
The Same Origin Policy forbids any web page to access contents of any other web page in a different domain so basically you cannot do that.
However it seems that with some browsers it is allowed to access web pages content if you are trying to access it from a local web page which seems to be your case.
Safari and IE 6/7/8 are browsers that allow a local web page to do so via XMLHttpRequest (source: Google Browser Security Handbook), so you may want to use one of those browsers to do what you need (note that future versions of those browsers may no longer allow it).
Apart from this solution I only see two possibilities:
If the web pages you need to fetch content from are somehow controlled by you, you can create a simpler interface to let other web pages to get the content you need (for example allowing JSONP requests).
If the web pages you need to fetch content from are not controlled by you the only solution I see is to fetch content server side logging in from the server directly (I know that you don't want to do so, but I don't see any other possibility if the previous I mentioned are not practicable)
Hope it helps.
Actually I've seen Cross Domain jQuery .load request before, here: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The author claims that code like this, found on that page,
$('#container').load('http://google.com'); // SERIOUSLY!
$.ajax({
url: 'http://news.bbc.co.uk',
type: 'GET',
success: function(res) {
var headline = $(res.responseText).find('a.tsh').text();
alert(headline);
}
});
// Works with $.get too!
would work. (The BBC code might not work because of the recent redesign, but you get the idea)
Apparently it is using YQL wrapped into a jQuery plugin to do the trick. Now I cannot say I fully understand what he is doing there but it appears to work, and fits the bill. Once you load the data I suppose it is a simple matter of filtering out the data that you need.
If you prefer something that works at the browser level, may I suggest Mozilla's Jetpack framework for lightweight extensions. I've not yet read the documentation in its entirety but it should contain the APIs needed for this to work.
There are various ways to go about this in AJAX, I'm going to show the jQuery way for brevity as one option, though you could do this in vanilla JavaScript as well.
Instead of an <iframe> you can just use a container, let's say a <div> like this:
<div id="description_container"></div>
Then to load it:
$(function() {
$("#get_description_button").click(function() {
$("#description_container").load($("#url").val() + " #description");
});
});
This uses the .load() method, which takes a string in the format .load("url selector"), fetches the page, and places the content of the element matching the selector inside the container you're loading into, in this case #description_container.
This is just the jQuery route, mainly to illustrate that yes, you can do what you want, but you don't have to do it exactly like this, just showing the concept is getting what you want from an AJAX request, rather than in an <iframe>.
Your description sounds like you are fetching pages from the same domain (you said that you need to be logged in and have session credentials), so have you tried an async request via XMLHttpRequest? It might complain if the html on a page is particularly messed up, but you should still be able to get the raw text via .responseText and extract what you need with a regex.
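A minimal sketch of that idea, using fetch instead of XMLHttpRequest and assuming the page is same-origin (or otherwise readable by your script). The function names are hypothetical, and the regex stops at the first closing </div>, so it only suits a description block without nested divs. The request downloads the HTML text only; the images, CSS and scripts it links to are not fetched.

```javascript
// Hypothetical helper: pull the inner HTML of <div id="description"> out of
// raw page text. Returns null when no such element is found.
function extractDescription(html) {
  const match = /<div[^>]*\bid=["']description["'][^>]*>([\s\S]*?)<\/div>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Fetch the page text with the user's cookies and extract the block.
async function fetchDescription(url) {
  const response = await fetch(url, { credentials: 'include' });
  return extractDescription(await response.text());
}

// Browser usage:
// fetchDescription(document.getElementById('url').value).then(console.log);
```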
