Using this post, I'm trying to load a document via Ajax and find the contents of specific document node(s), so that I can display them without navigating the browser to another page.
However, my document always seems to be an empty document.
Ajax callback:
function processRatingToken(data) { // data is just a standard HTML document string
var doc = document.implementation.createHTMLDocument();
doc.open();
//Replace scripts
data = data.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, "");
//Write HTML to the new document
doc.write(data);
doc.close();
console.log(doc.body); //Empty
}
So what's wrong?
Note: I'm using this strategy because I'm building a Greasemonkey userscript. If you are developing an Ajax application, this strategy is NOT recommended. Use JSON instead.
There is a workaround using the .innerHTML property:
doc.childNodes[1].innerHTML = data;
Where .childNodes[1] is the <html> element.
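For what it's worth, here is a minimal sketch of the same idea with the script-stripping step pulled out into its own function. The names stripScripts and parseHtmlString are made up for illustration, and DOMParser with the "text/html" type is an alternative I'm assuming your browser supports, not something taken from the code above. Outside a browser, parseHtmlString simply returns null.

```javascript
// Same regex as in the callback above: removes <script>...</script> blocks.
var SCRIPT_RE = /<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi;

// Hypothetical helper: strip script blocks from an HTML string.
function stripScripts(html) {
    return html.replace(SCRIPT_RE, "");
}

// Hypothetical helper: parse an HTML string into a detached Document.
// Assumes a browser with DOMParser "text/html" support; returns null elsewhere.
function parseHtmlString(html) {
    if (typeof DOMParser === "undefined") {
        return null;
    }
    return new DOMParser().parseFromString(stripScripts(html), "text/html");
}

var sample = "<html><body><p>Hi</p><script>alert(1)</script></body></html>";
console.log(stripScripts(sample));
```

In a browser you could then query the result as usual, e.g. parseHtmlString(data).querySelector("p"), without the fetched page ever being attached to the visible document.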
Related
I am currently implementing a Chrome extension to parse certain websites. I came across a site whose contents are generated by inline/external JS code (I think!). How can I parse a website of this kind? I am trying to fetch the whole page through XMLHttpRequest() inside my parser. I tried using eval() and jQuery's html(). With jQuery I could parse some of the elements, but the results were inaccurate.
Sample code of my parser:
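// Note: without a scheme (http://) this address is resolved relative to the current page
var siteaddress = "www.xyz.com/search?q=abcd";
var req = new XMLHttpRequest();
req.open('GET', siteaddress, true);
parseHT(req, x); // attach the onload handler before sending
req.send(null);
function parseHT(req_new, x) {
    req_new.onload = function () {
        //console.log(this.responseText);
        var jshtml = req_new.responseText;
        var el = $('<div></div>');
        var html = el.html(jshtml); // jQuery object wrapping the parsed markup
        //process steps follows this
    };
}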
Thanks
I'm working on a simple webmail script in PHP. The content of a message body is retrieved using jQuery, which gets the content returned from a PHP script. For example:
$.get("file.php", function(data) { /* Data is the message content */ });
From here, I then write the string in data to the document of an iframe. I want to make sure that the returned content is sanitized, and one step towards this is removing all references to external files, particularly remote files accessed over HTTP, such as JavaScript files or images on a server somewhere. This is important because external scripts may try to manipulate my page, and external images may be served through a dynamic engine like PHP, confirming to spammers that my email address is active and able to receive mail. Some images can apparently also contain viruses.
The following script can remove a lot of things that may be hazardous:
function sanitize(str) {
    var html = $(str);
    var evil = ["head", "base", "link", "script", "img", "object", "embed", "video", "audio", "iframe"];
    // Remove every descendant element whose tag name is on the list
    for (var e = 0; e < evil.length; e++) { html.find(evil[e]).remove(); }
    var result = html.wrap("<div>").parent().html();
    return result;
}
But my question is this: how can I remove a line of CSS that contains a reference to an external file? For example, if the message body content contained a tag and inside it was this:
background-image: url(http://some/dodgy/server/image.jpg);
how would I remove that line from the string?
This has not been tested, but you can try something like:
str = str.replace(/background-image:\s*url\([^)]*\);\s*/ig, "");
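A slightly more general sketch, in case the reference is not always in background-image: this drops any single declaration whose url(...) points at an http://, https://, or protocol-relative // address and leaves local references alone. stripExternalCssUrls is a made-up name, and the regex reflects my own assumption about what counts as "external". It is a text-level filter, not a real CSS parser, so treat it as a starting point only.

```javascript
// Hypothetical helper: remove CSS declarations that reference a remote url().
// Matches "property: ... url(http://...) ...;" within a single declaration.
var EXTERNAL_URL_DECL = /[\w-]+\s*:[^;{}]*url\(\s*['"]?(?:https?:)?\/\/[^)]*\)[^;{}]*;?\s*/gi;

function stripExternalCssUrls(css) {
    return css.replace(EXTERNAL_URL_DECL, "");
}

var css = "color: red; background-image: url(http://some/dodgy/server/image.jpg); margin: 0;";
console.log(stripExternalCssUrls(css));
```

Declarations such as background: url(img/a.png); are not touched, because the pattern only fires when the address inside url(...) starts with a scheme or with //.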
I have to extract information from an HTML table on a website. I want to make an HTTP request from a Node.js server to that website and parse the HTML table. Are there any libraries or techniques for JS, other than regular expressions, to parse the data from the table cells?
Sorry, I'm very new to programming.
Look at the excellent Cheerio library:
https://github.com/MatthewMueller/cheerio
Examples are in the GitHub repository.
// Browser API: parses the HTML string into a Document
var doc = new DOMParser().parseFromString(your_downloaded_html_page_as_string, "text/html");
You can then use normal DOM functions such as getElementsByTagName, firstChild, etc. to get the actual data from the HTML page you downloaded.
Refer to "Parse a HTML String with JS" for more methods.
jsdom is a great module for this
// Count all of the links from the Node.js build page
var jsdom = require("jsdom");
jsdom.env(
"http://nodejs.org/dist/",
["http://code.jquery.com/jquery.js"],
function (errors, window) {
console.log("there have been", window.$("a").length, "nodejs releases!");
}
);
I would use jQuery. You could iterate through all table cells like so (this will alert the HTML inside every table cell):
$('td').each(function () { alert($(this).html()); });
or for a specific table:
$('#specific_table_id td').each(function () { alert($(this).html()); });
I am able to select a linked script element. Is there a way to read the contents into a string?
I have jQuery available.
You can make an Ajax request to the script URL (the src attribute). The server will return the JS source. Example:
var script = $('selector');
var src = script.attr('src');
if (src.indexOf('://') == -1) {
src = document.location.href.substr(0, document.location.href.lastIndexOf('/') + 1) + src;
}
$.get(src, {}, function(data){
// do something
});
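If the environment has the standard URL constructor (an assumption on my part; older browsers may not), resolving the src against the page address can be left to it. It also covers root-relative paths such as /js/app.js and protocol-relative ones such as //cdn.example.com/x.js, which the indexOf('://') check above would handle incorrectly. resolveScriptUrl and the example.com addresses are made up for illustration.

```javascript
// Hypothetical helper: turn a script's src attribute into an absolute URL.
// pageUrl would normally be document.location.href.
function resolveScriptUrl(src, pageUrl) {
    return new URL(src, pageUrl).href;
}

var page = "http://example.com/dir/page.html";
console.log(resolveScriptUrl("js/app.js", page));
console.log(resolveScriptUrl("/js/app.js", page));
console.log(resolveScriptUrl("//cdn.example.com/x.js", page));
```

The result can be passed straight to $.get in place of the hand-built src.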
I've never tried this before, but you could get the script contents with something like
$(scriptselector).text()
Edited.
Just tried it and it seems to work http://jsfiddle.net/UG3hB/
Edit #2:
As Intersteller_Coder pointed out, this will not work with linked JS files. You can always request the JS file via Ajax.
It depends on whether the script is embedded or linked.
For embedded JavaScript you could do
var test = $('script').eq( index_here ).text();
For an external javascript file with a source attribute, you could perform an ajax request and get the contents.
$("<div/>").append($("#myAnchor")).text();
Yes, just get the innerText or innerHTML from the element reference.
In an app that I am creating, I have to receive from the server an XML string in this format, e.g.:
<reply>
<script>
alert('Hello World!');
</script>
</reply>
When I did this using Ajax it worked perfectly, but when I try to receive the data in an iframe I can't extract the data from the frame because it is not there. IE and FF open new tabs and put the data in that tab. How do I avoid that and make them insert the data into the frame?
I can make this work while still using JavaScript: get the result of the Ajax call and write it inside the iframe.
First create your iframe tag like this:
Then the JavaScript code to insert the Ajax result:
var t = document.getElementById('iftarget');
var h = t.contentWindow.document.getElementsByTagName('html');
h[0].innerHTML = '<h1>Hello</h1> This must work! Put your data here';
I have created a jsFiddle for this
http://jsfiddle.net/nunomazer/JGyEr/
Best Regards