Getting the HTML content of an iframe using jQuery - javascript

I'm currently trying to customize OpenCms (a Java-based open source CMS), which embeds the FCKEditor; that editor is what I'm trying to access using JS / jQuery.
I try to fetch the HTML content of the iframe, but I always get null in return.
This is how I try to fetch the html content from the iframe:
var editFrame = document.getElementById('ta_OpenCmsHtml.LargeNews_1_.Teaser_1_.0___Frame');
alert( $(editFrame).attr('id') ); // returns the correct id
alert( $(editFrame).contents().html() ); // returns null (!!)
Looking at the screenshot, what I want to access is the 'LargeNews1/Teaser' HTML section, which currently holds the value "Newsline en...".
Below you can also see the html structure in Firebug.
However, $(editFrame).contents().html() returns null and I can't figure out why, whereas $(editFrame).attr('id') returns the correct id.
The iframe content / FCKEditor is on the same site/domain, no cross-site issues.
HTML code of iframe is at http://pastebin.com/hPuM7VUz
Updated:
Here's a solution that works:
var editArea = document.getElementById('ta_OpenCmsHtml.LargeNews_1_.Teaser_1_.0___Frame').contentWindow.document.getElementById('xEditingArea');
$(editArea).find('iframe:first').contents().find('html:first').find('body:first').html('some <b>new</b><br/> value');

.contents().html() doesn't work to get the HTML code of an IFRAME. You can do the following to get it:
$(editFrame).contents().find("html").html();
That should return all the HTML in the IFRAME for you. Or you can use "body" or "head" instead of "html" to get those sections too.

You can get the content like this:
$('#iframeID').contents().find('#someID').html();
Note that the frame must be on the same domain. See http://simple.procoding.net/2008/03/21/how-to-access-iframe-in-jquery/

I suggest replacing the first line with:
var editFrame = $('#ta_OpenCmsHtml\\.LargeNews_1_\\.Teaser_1_\\.0___Frame'); // dots in the id must be escaped in a selector
...and the 2nd alert expression with:
editFrame.html()
If, on the other hand, you prefer to accomplish the same without jQuery (much cooler, IMHO), you could use plain JavaScript:
var editFrame = document.getElementById('ta_OpenCmsHtml.LargeNews_1_.Teaser_1_.0___Frame');
alert(editFrame.innerHTML);
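A note on the selector form, offered as a sketch rather than a definitive fix: the frame id contains dots, and a CSS selector reads `.LargeNews_1_` as a class name, so an id like this has to be escaped before it goes into `$('#…')`. The helper below uses a made-up name (`escapeIdForSelector`) and a simplified rule: it backslash-escapes every character that is not a letter, digit, underscore, or hyphen.

```javascript
// Hypothetical helper (not part of jQuery): backslash-escape every character
// that is not a letter, digit, underscore, or hyphen, so the id can follow '#'
// in a selector. Simplification: ids that start with a digit are not handled.
function escapeIdForSelector(id) {
  return id.replace(/([^\w-])/g, '\\$1');
}

const frameId = 'ta_OpenCmsHtml.LargeNews_1_.Teaser_1_.0___Frame';
const frameSelector = '#' + escapeIdForSelector(frameId);

console.log(frameSelector);
```

In the browser, `$(frameSelector)` could then be used to select the frame. `document.getElementById(frameId)` takes the raw id and needs no escaping.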

After trying a number of jQuery solutions that recommended using the option below, I discovered I was unable to get the actual <html> content including the parent tags.
$("#iframeId").contents().find("html").html()
This worked much better for me and I was able to fetch the entire <html>...</html> iframe content as a string.
document.getElementById('iframeId').contentWindow.document.documentElement.outerHTML

I think FCKEditor has its own API; see http://cksource.com/forums/viewtopic.php?f=6&t=8368

Looks like jQuery doesn't provide a method to fetch the entire HTML of an iFrame, however since it provides access to the native DOM element, a hybrid approach is possible:
$("iframe")[0].contentWindow.document.documentElement.outerHTML;
This will return the iframe's HTML, including <HTML>, <HEAD> and <BODY>.
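The hybrid approach can be wrapped in a small function. This is only a sketch that assumes a same-origin frame; `getFrameHtml` and the `fakeFrame` stand-in are names invented here, and the stand-in exists only so the snippet also runs outside a browser.

```javascript
// Return the full markup of an iframe's document as a string.
// `frame` is expected to be an <iframe> DOM element on the same origin
// (or an object with the same shape).
function getFrameHtml(frame) {
  const doc = frame.contentDocument ||
    (frame.contentWindow && frame.contentWindow.document);
  if (!doc || !doc.documentElement) {
    return null; // no document available, e.g. the frame has not loaded yet
  }
  return doc.documentElement.outerHTML;
}

// Stand-in object, used only for illustration outside a browser.
const fakeFrame = {
  contentWindow: {
    document: {
      documentElement: {
        outerHTML: '<html><head></head><body>Newsline en</body></html>'
      }
    }
  }
};

console.log(getFrameHtml(fakeFrame));
```

With jQuery on the page, the call would look like `getFrameHtml($('iframe')[0])`.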

Your iframe:
<iframe style="width: 100%; height: 100%;" frameborder="0" aria-describedby="cke_88" title="Rich text editor, content" src="" tabindex="-1" allowtransparency="true"/>
We can get the data from this iframe as:
var content=$("iframe").contents().find('body').html();
alert(content);

Related

How can you load an external web page into an Iframe with Jquery with Custom variables?

Can you load a page into an iframe with JQuery? I have a page that creates a custom printable pdf and need it to load into an iframe to make it easier for the user. I use jquery to pull in all the variables otherwise I could have it load within the page. I am not sure what I am missing with this command to load the page within id="print_form_modal2"?
$.frameReady(function(){
$("#print_form").prepend('<div id="newDiv"></div>');
$('#newDiv').load("print_audit.php?auditID="+auditID+"&action=print&print_name="+print_name+"&print_orient="+print_orient+"&download_option="+download_option+"&type=pdf");
});
<iframe id="print_form_modal2" name="iFrame" src=""></iframe>
You could do it the painless way and just use HTML:
Make an <a>nchor with the href to your PDF.
Add an iframe with a name attribute (ex. name="iframe1")
Next, add a target="iframe1" to the <a>.
It's simple enough to do what you're trying to do using just JavaScript and HTML.
HTML:
<iframe id="print_form_modal2" name="iFrame"></iframe>
JavaScript:
function openIframe() {
document.getElementById("print_form_modal2").setAttribute("src", "https://www.example.com/");
}
You can see the code in a CodePen here: http://codepen.io/anon/pen/VjbBbY
In your code you need to replace https://www.example.com/ with the source path of the PDF you wish to display, and change when openIframe is called to suit your requirements.
What you want to do, on document ready (or whatever event is relevant to your logic), is get the iframe and use the attr method to change its src attribute to point to the new source.
Like this:
$(document).ready(function() {
$('#iframe-container').attr('src','http://www.w3schools.com/tags/tag_iframe.asp');
});
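To carry the question's variables into the iframe, the query string can be built first and then assigned to `src`. The sketch below reuses the parameter names from the question; the function names and the sample values are invented for illustration, and `URLSearchParams` handles the encoding.

```javascript
// Build the print URL from the values used in the question.
function buildPrintUrl(options) {
  const query = new URLSearchParams({
    auditID: String(options.auditID),
    action: 'print',
    print_name: options.print_name,
    print_orient: options.print_orient,
    download_option: options.download_option,
    type: 'pdf'
  });
  return 'print_audit.php?' + query.toString();
}

// Point an iframe (or any object with a `src` property) at the generated PDF.
function showPrintFrame(frame, options) {
  frame.src = buildPrintUrl(options);
  return frame;
}

// Sample values are invented for illustration.
const sampleOptions = {
  auditID: 17,
  print_name: 'Audit report',
  print_orient: 'landscape',
  download_option: 'inline'
};
const printUrl = buildPrintUrl(sampleOptions);
console.log(printUrl);
```

In the page this would be called as `showPrintFrame(document.getElementById('print_form_modal2'), sampleOptions)`.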

jQuery: html() function does not match real HTML

I am trying to get the EXACT html content of a div.
When using the html() function from jQuery, the result does not match the actual content.
Please check this fiddle and click on the black square:
http://jsfiddle.net/qRska/6/
The code:
<div id="mydiv" style="width:100px; height: 100px; background-color:#000000; cursor:pointer;">
<div id="INSIDE" style="background-color:#ffffff; border-style:none;"></div>
</div>
$('#mydiv').click(function() {
alert($(this).html());
});
jQuery changes the color to RGB format and removes the border-style declaration.
How can I solve this problem?
The browser consumes the HTML, generates a DOM, then discards the HTML. innerHTML (which is what .html() eventually hits) gives a serialisation of the DOM back to HTML.
If you want to get the raw HTML, then you'll need to use XMLHttpRequest to fetch the source code of the current URL and then process it yourself.
What you want to do is unfortunately not possible. The original HTML is not available after it is parsed by the browser, so you have to jump through some hoops to prevent the browser from processing it.
One possible solution that I've used before is to wrap the HTML in comment tags, which the browser leaves unchanged. You can then extract the comment using jQuery's .html() method on the wrapper (.text() skips comment nodes); strip out the comment tags with string replacement; make the necessary changes to the markup; and then inject it back into the document.
The other alternative is to use AJAX to load the HTML. Make sure you set the dataType to 'text' so the response is handed to you as a plain string instead of being processed.
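The string-replacement step of the comment-wrapping idea might look like the sketch below. It assumes the wrapper's content has already been read back as a string (for instance from its innerHTML); `unwrapComment` is a name made up here, and the sample markup is the inner div from the question.

```javascript
// Strip the leading "<!--" and trailing "-->" from a comment-wrapped string.
function unwrapComment(text) {
  return text.replace(/^\s*<!--/, '').replace(/-->\s*$/, '');
}

// What the wrapper would hold: the original markup, kept verbatim in a comment.
const stored =
  '<!--<div id="INSIDE" style="background-color:#ffffff; border-style:none;"></div>-->';

const rawMarkup = unwrapComment(stored);
console.log(rawMarkup);
```

The hex color and the `border-style:none` declaration survive because the browser never parsed them as markup.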

jQuery parse HTML without loading images

I load HTML from other pages to extract and display data from that page:
$.get('http://example.org/205.html', function (html) {
console.log( $(html).find('#c1034') );
});
That does work but because of the $(html) my browser tries to load images that are linked in 205.html. Those images do not exist on my domain so I get a lot of 404 errors.
Is there a way to parse the page like $(html) but without loading the whole page into my browser?
Actually if you look in the jQuery documentation it says that you can pass the "owner document" as the second argument to $.
So what we can then do is create a virtual document so that the browser does not automatically load the images present in the supplied HTML:
var ownerDocument = document.implementation.createHTMLDocument('virtual');
$(html, ownerDocument).find('.some-selector');
Use a regex to remove all <img> tags:
html = html.replace(/<img[^>]*>/g,"");
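A slightly hardened variant of that replacement, as a sketch: the `i` flag also catches `<IMG>`, and `\b` keeps the pattern from matching other tag names that merely start with "img". The function name and the sample markup are invented; a `>` inside an attribute value would still confuse it.

```javascript
// Remove every <img ...> tag from an HTML string before it is handed to $(html).
function stripImageTags(html) {
  return html.replace(/<img\b[^>]*>/gi, '');
}

// Invented sample standing in for the fetched page.
const fetched =
  '<div id="c1034"><IMG src="/a.png"><p>Data <img src="b.gif" alt="b"/></p></div>';

const withoutImages = stripImageTags(fetched);
console.log(withoutImages);
```

The cleaned string could then be parsed as in the question, e.g. `$(withoutImages).find('#c1034')`.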
Sorry for resuscitating an old question, but this is the first result when searching for how to try to stop parsed html from loading external assets.
I took Nik Ahmad Zainalddin's answer; however, it has a weakness: any content between two separate <script> elements gets wiped out.
<script>
</script>
Inert text
<script>
</script>
In the above example Inert text would be removed along with the script tags. I ended up doing the following instead:
html = html.replace(/<\s*(script|iframe)[^>]*>(?:[^<]*<)*?\/\1>/g, "").replace(/(<(\b(img|style|head|link)\b)(([^>]*\/>)|([^\7]*(<\/\2[^>]*>)))|(<\bimg\b)[^>]*>|(\b(background|style)\b=\s*"[^"]*"))/g, "");
Additionally I added the capability to remove iframes.
Hope this helps someone.
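The "Inert text" problem comes down to greedy versus lazy matching. The sketch below uses a deliberately simpler pattern than the one in this answer (it handles scripts only) to show the difference; both function names are made up.

```javascript
// Lazy match: each <script>...</script> pair is removed on its own.
function stripScripts(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '');
}

// Greedy match, shown for contrast: it runs from the first <script>
// to the last </script> and swallows whatever sits between them.
function stripScriptsGreedy(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*<\/script\s*>/gi, '');
}

const sample =
  '<script>var a = 1;</script>Inert text<script>var b = 2;</script>';

console.log(JSON.stringify(stripScripts(sample)));
console.log(JSON.stringify(stripScriptsGreedy(sample)));
```

The lazy version keeps the text between the two script elements, while the greedy one drops it.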
Parsing HTML the following way will load images automatically:
var wrapper = document.createElement('div'),
html = '.....';
wrapper.innerHTML = html;
If you use DOMParser to parse the HTML, the images will not be loaded automatically. See https://github.com/panzi/jQuery-Parse-HTML/blob/master/jquery.parsehtml.js for details.
You could either use jQuery's remove() method to remove the image elements
console.log( $(html).find('img').remove().end().find('#c1034') );
or remove them from the HTML string. Something like
console.log( $(html.replace(/<img[^>]*>/g,"")) );
Regarding background images, you could do something like this:
$(html).filter(function() {
return $(this).css('background-image') !== '';
}).remove();
The following regex removes all occurrences of <img>, <head>, <link>, <script> and <style>, as well as background and style attributes, from the data string returned by the ajax load.
html = html.replace(/(<(\b(img|style|script|head|link)\b)(([^>]*\/>)|([^\7]*(<\/\2[^>]*>)))|(<\bimg\b)[^>]*>|(\b(background|style)\b=\s*"[^"]*"))/g,"");
Test regex: https://regex101.com/r/nB1oP5/1
I wish there were a better workaround (other than using a regex replace).
Instead of removing all img elements altogether, you can use the following regex to delete all src attributes instead:
html = html.replace(/src="[^"]*"/ig, "");

Most secure javascript JSON Inline technique

I'm using varnish+esi to return external json content from a RESTFul API.
This technique allows me to manage request and refresh data without using webserver resources for each request.
e.g:
<head>
....
<script>
var data = <esi:include src='apiurl/data'>;
</script>
...
After the include is processed, Varnish returns:
var data = {attr:1, attr2:'martin'};
This works fine, but if the API returns an error, this technique will generate a parse error.
var data = <html><head><script>...api js here...</script></head><body><h1 ... api html ....
I solved this problem using a hidden div to parse and catch the error:
...
<b id=esi-data style=display:none;><esi:include src='apiurl/data'></b>
<script>
try{
var data = $.parseJSON($('#esi-data').html());
} catch (e) { alert('manage the error here'); }
....
I've also tried using a script type text/esi, but the browser renders the html inside the script tag (wtf), e.g:
<script id=esi-data type='text/esi'><esi:include src='apiurl/data'></script>
Question:
Is there any way to wrap the tag and prevent the browser from parsing it?
Let me expand upon the iframe suggestion I made in my comment—it's not quite what you think!
The approach is almost exactly the same as what you're doing already, but instead of using a normal HTML element like a div, you use an iframe.
<iframe id="esi-data" src="about:blank"><esi:include src="apiurl/data"></iframe>
var $iframe = $('#esi-data');
try {
var data = $.parseJSON($iframe.html());
} catch (e) { ... }
$iframe.remove();
#esi-data { display: none; }
How is this any different from your solution? Three ways:
The data/error page are truly hidden from your visitors. An iframe has an embedded content model, meaning that any content within the <iframe>…</iframe> tags gets completely replaced in the DOM—but you can still retrieve the original content using innerHTML.
It's valid HTML5… sort-of. In HTML5, markup inside iframe elements is treated as text. Sure, you're meant to be able to parse it as a fragment, and it's meant to contain only phrasing content (and no script elements!), but it's essentially just treated as text by the validator—and by browsers.
Scripts from the error page won't run. The content gets parsed as text and replaced in the DOM with another document—no chance for any script elements to be processed.
Take a look at it in action. If you comment out the line where I remove the iframe element and inspect the DOM, you can confirm that the HTML content is being replaced with an empty document. Also note that the embedded script tag never runs.
Important: this approach could still break if the third party added an iframe element into their error page for some reason. Unlikely as this may be, you can bulletproof the approach a little more by combining your technique with this one: surround the iframe with a hidden div that you remove when you're finished parsing.
Here I go with another attempt.
Although I believe you already have possibly the best solution for this, the only workaround I can imagine is a fairly low-performance one: call esi:include in a separate HTML page, then retrieve its contents on the server as if you were using AJAX. Perhaps similar to this? Then check the contents you retrieved, maybe by using json_decode, and on failure generate an error JSON string.
The greatest downside I see is that this would be very resource-consuming and would most likely even delay your requests, as the separate page is called as if your server itself were a client, parsed, then sent back.
I'd honestly stick to your current solution.
This is a rather tricky problem with no really elegant solution, if there is a solution at all.
I asked you if it was an HTML(5) or XHTML(5) document, because in the latter case a CDATA section can be used to wrap the content, slightly changing your solution to something like this:
...
<b id='esi-data' style='display:none;'>
<![CDATA[ <esi:include src='apiurl/data'> ]]>
</b>
<script>
try{
var data = $.parseJSON($('#esi-data').html());
} catch (e) { alert('manage the error here'); }
....
Of course, this solution works only if:
you're using XHTML5 and
the error contains no CDATA section (because CDATA section nesting is impossible).
I don't know if switching from one serialization to the other is an option, but I wanted to clarify the intent of my question. It will hopefully help you out :).
Can't you simply change your API to return JSON { "error":"error_code_or_text" } on error? You can even do something meaningful in your interface to alert the user about the error if you do it that way.
<script>var data = 999;</script>
<script>
data = <esi:include src='apiurl/data'>;
</script>
<script>
if(data == 999) alert("there was an error");
</script>
If there is an error and "data" is not JSON, then a javascript error will be thrown. The next script block will pick that up.
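The error handling discussed in this thread can be reduced to a function that never throws, sketched below. `parseIncludedJson` is a made-up name, and the sample strings are stand-ins for what the include might deliver. Note that a strict JSON parser wants double-quoted keys and strings, so a payload written as `{attr:1, attr2:'martin'}` is treated as a failure here even though it is valid as a JavaScript object literal.

```javascript
// Parse the text delivered by the include; report failure instead of throwing.
function parseIncludedJson(text) {
  try {
    return { ok: true, data: JSON.parse(text) };
  } catch (err) {
    return { ok: false, data: null, error: err.message };
  }
}

// Stand-ins for a successful API response and an HTML error page.
const good = parseIncludedJson('{"attr": 1, "attr2": "martin"}');
const bad = parseIncludedJson('<html><head></head><body><h1>Error</h1></body></html>');

console.log(good.ok, good.data);
console.log(bad.ok);
```

The caller can then branch on `ok` instead of wrapping every use of the data in its own try/catch.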

How to append the HTML results of a URL call to the DOM?

I have a URL that resides on another domain, like this:
http://ads.adserver.com/ad?site=1233&zone=45435
When you type this URL in the browser, the result is HTML like this:
<img src="http://ads.adserver.com/i/image.gif" border="0"/><br/>Test
The above renders as an image wrapped in a link with a second link below it.
I tried to capture this URL in a script tag and append it to the DOM, but it does not render the HTML above.
var ad_script = document.createElement('script');
ad_script.type = 'text/javascript';
ad_script.src = 'http://ads.adserver.com/ad?site=1233&zone=45435';
li.appendChild(ad_script);
Are there any other ways of invoking this URL and putting the result on the page? I can't use $.getScript() since I'm not invoking this in the global context. I need this HTML to appear exactly where I want it to appear.
EDIT: The only reason I am trying this route is that the third-party does not provide a JSON-P interface.
EDIT2: Unfortunately, I am not on an application server.
EDIT3: This is for iPhone.
You are limited in how you can load this due to cross-domain issues. An easy way would be to load the image in its own iframe:
<iframe src='http://ads.adserver.com/ad?site=1233&zone=45435' height='200px' width='200px'></iframe>
You'll want to use ajax for this. The easiest way might be jQuery's .load() method.
Assuming the element you want the content to go into has an id of holder, you would do
$('#holder').load('http://ads.adserver.com/ad?site=1233&zone=45435');
It will put the contents of the webpage into your selected element. http://api.jquery.com/load/
edit: Sorry, forgot cross-site ajax limitations. You could instead set up a php page like:
if(isset($_GET['url'])){
echo file_get_contents($_GET['url']);
}
and then do
$('#holder').load('http://yoursite.com/yourpage.php?url=' + encodeURIComponent('http://ads.adserver.com/ad?site=1233&zone=45435'));
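One detail worth spelling out: the ad URL has its own query string, so it should be encoded before it is passed as the `url` parameter. Otherwise `&zone=45435` is read as a separate parameter of the proxy page and never reaches the ad server. A minimal sketch, with `buildProxyUrl` as a made-up helper name:

```javascript
// Build the proxy request, encoding the target so its own "?" and "&"
// stay inside the single `url` parameter.
function buildProxyUrl(proxyPage, targetUrl) {
  return proxyPage + '?url=' + encodeURIComponent(targetUrl);
}

const adUrl = 'http://ads.adserver.com/ad?site=1233&zone=45435';
const proxied = buildProxyUrl('http://yoursite.com/yourpage.php', adUrl);

console.log(proxied);
```

The result is what would be handed to `.load()`; PHP decodes the value again when it fills `$_GET['url']`.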
Take the JSONP approach. Use an actual script tag, and have the response be an executing JavaScript function that adds that HTML markup to the DOM. It will add it to the DOM of the page where the script tag resides.
There are a number of jQuery ajax calls that can accomplish this, for example:
jQuery.get('http://ads.adserver.com/ad?site=1233&zone=45435', function(data){jQuery(updateElementQuery).html(data);});
Try using jQuery's load method:
$('#result').load(url, function(response) {
$('#my-li').append($('#result').html());
$('#result').html(null);
});
... where #result is some hidden div
