Cannot execute JavaScript XPath queries on a created document

Problem
I'm creating a document with JavaScript and I'd like to execute XPath queries on it.
I've tried this in Safari and Chrome.
I've read up on createDocument and XPath searches, and it really seems like this code should work.
At this point it looks like it may be a WebKit bug.
My requirements:
I can use innerHTML to set up the document
I can execute XPath searches with tag names
The code:
If you copy/paste the following into the webkit inspector, you should be able to repro.
function search(query, root) {
    // 7 is XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    var result = document.evaluate(query, root, null, 7, null);
    var nodes = [];
    var node_count = result.snapshotLength;
    for (var i = 0; i < node_count; i++) {
        nodes.push(result.snapshotItem(i));
    }
    return nodes;
}
x = document.implementation.createDocument('http://www.w3.org/1999/xhtml', 'html', 'HTML');
body = x.createElement('body');
body.innerHTML = "<span class='mything'><a></a></span>";
xdoc = x.documentElement; //html tag
xdoc.appendChild(body);
console.log(search(".", xdoc)); // --> [<html>…</html>]
console.log(search("/*", xdoc)); // --> [<html>…</html>]
console.log(search("/html", xdoc)); // --> []
Best Guess
So I can definitely search using XPath, but I cannot search using tagnames. Is there something silly I'm missing about the namespace?

Have you tried:
console.log(search("//html", xdoc));
I'm not familiar with Safari specifically, but the problem might be that Safari is adding another node above HTML. If so, the first two queries might be showing you that node plus its children, which would make them look like they're working properly, while the third query would fail because there would be no root => HTML node.
Just a thought.
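On the namespace question raised above: in XPath 1.0 an unprefixed name test such as html only matches elements that are in no namespace, while createDocument was given the XHTML namespace, so a namespace resolver is one thing worth trying. Below is a minimal sketch of that idea, not a confirmed fix for the WebKit behaviour described. The names xhtmlResolver and searchNS and the prefix "x" are made up for illustration.

```javascript
// Namespace URI that was passed to createDocument() in the question.
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical resolver: maps the arbitrary prefix "x" to the XHTML namespace.
function xhtmlResolver(prefix) {
  return prefix === 'x' ? XHTML_NS : null;
}

// Same idea as search() in the question, but with a resolver as the third
// argument and evaluate() called on the document that owns the context node.
// 7 is XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
function searchNS(query, root) {
  var doc = root.ownerDocument || root;
  var result = doc.evaluate(query, root, xhtmlResolver, 7, null);
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== 'undefined') {
  var x = document.implementation.createDocument(XHTML_NS, 'html', null);
  // The prefixed query is expected to match the root element.
  console.log(searchNS('/x:html', x.documentElement));
}
```

With a resolver in place, tag-name queries are written with the prefix, for example //x:span instead of //span.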

Related

What is the best way to add big amount of html elements using innerHTML idea?

Not on Chrome, but on Firefox the browser freezes every time I make an ajax request.
Here's the deal...
The ajax request receives a huge chunk of HTML, more than 75,000 characters long.
<div> ... <table> ... etc ... </table> ... </div>
So I started using replace to shrink it:
var html = data.replace(/\r?\n|\r/g, '').replace(/\s{2,}/g, ' ')
That got it down to 55,000, which is still not enough.
I've been searching, but so far I've found nothing that helps.
Here's what I tried:
1.
asyncInnerHTML(html, function(fragment){
$(tab).get(0).appendChild(fragment); // myTarget should be an element node.
});
2.
var node = document.createTextNode(html);
$(tab).get(0).innerHTML = node;
3.
$(tab).get(0).innerHTML = html;
4.
$(tab).append(html);
5.
$(tab).html(html);
The only one that was fast was the second, where JavaScript adds the content as a text node; of course that is not what I want, because I need the HTML rendered, not shown as a string.
I hope that someone could help me.
Anyway, thanks.
Here's a piece of code that parses out the rows from your table HTML and then adds them one at a time to give the browser more time to breathe while parsing your HTML. This parsing logic is specific to your HTML and makes some assumptions about that HTML:
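function addLargeHTML(parent, h) {
    var d = document.createElement("div");
    // split the markup into the table shell and its individual rows
    var pieces = extractRows(h);
    d.innerHTML = pieces.core;
    var tbody = $(d).find("tbody");
    $(parent).append(d);
    var cntr = 0;
    // append one row per timer tick so the browser can stay responsive
    function next() {
        if (cntr < pieces.rows.length) {
            tbody.append(pieces.rows[cntr]);
            ++cntr;
            setTimeout(next, 1);
        }
    }
    next();
}
function extractRows(h) {
    var body = "";
    // pull the <tbody> contents out, leaving an empty <tbody> behind
    // (assumes newlines were already stripped, since . does not match them)
    h = h.replace(/<tbody>(.*?)<\/tbody>/, function(match, p1) {
        body = p1;
        return "<tbody></tbody>";
    });
    var rows = body.match(/<tr.*?<\/tr>/g) || [];
    return {core: h, rows: rows};
}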
You can see it working in this jsFiddle (select the "Add Row by Row" button): http://jsfiddle.net/jfriend00/z7jn4p12/.
Since I could not reproduce your original problem in Firefox, I can't really say whether this would fix what you saw or not. But, it does break the HTML up into smaller pieces so if that was really the problem, this should help.
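If one row per timer tick turns out to be too slow for a very large table, a variation is to append the rows in small batches. This is only a sketch of that variation: toBatches and appendInBatches are hypothetical helpers, and the batch size of 50 in the usage comment is an arbitrary guess that would need tuning.

```javascript
// Split an array of row strings into batches of at most `size` items.
function toBatches(rows, size) {
  var batches = [];
  for (var i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Hand each batch (joined into one HTML string) to `appendBatch`, pausing
// briefly between batches, then call the optional `done` callback.
function appendInBatches(rows, size, appendBatch, done) {
  var batches = toBatches(rows, size);
  var index = 0;
  function next() {
    if (index < batches.length) {
      appendBatch(batches[index].join(''));
      index++;
      setTimeout(next, 1);
    } else if (done) {
      done();
    }
  }
  next();
}

// Possible wiring with the code above (jQuery assumed, as in the answer):
// appendInBatches(pieces.rows, 50, function (html) { tbody.append(html); });
```

The first batch is appended immediately and the rest follow on later timer ticks, so the page shows some rows right away.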
The Skype Toolbar for Firefox is an extension that detects phone numbers in web pages, and re-renders them as a clickable button that can be used to dial the number using the Skype desktop application.
So when the HTML is rendered, the Skype extension scans it for phone numbers, which makes the browser stop responding for a moment.
Thanks for all help.

How to alter DOM with xmldom by XPath in node.js?

I am trying to alter a DOM structure in node.js. I can load the XML string and alter it with the native methods in xmldom (https://github.com/jindw/xmldom), but when I load XPath (https://github.com/goto100/xpath) and try to alter the DOM via that selector, it does not work.
Is there another way to do this out there? The requirements are:
Must work both in the browser and server side (pure js?)
Cannot use eval or other code execution stuff (for security)
Example code to show how I am trying today below, maybe I simply miss something basic?
var xpath = require('xpath'),
dom = require('xmldom').DOMParser;
var xml = '<!DOCTYPE html><html><head><title>blah</title></head><body id="test">blubb</body></html>';
var doc = new dom().parseFromString(xml);
var bodyByXpath = xpath.select('//*[@id = "test"]', doc);
var bodyById = doc.getElementById('test');
var h1 = doc.createElement('h1').appendChild(doc.createTextNode('title'));
// Works fine :)
bodyById.appendChild(h1);
// Does not work :(
bodyByXpath.appendChild(h1);
console.log(doc.toString());
bodyByXpath is not a single node. The fourth parameter to select, if true, will tell it to only return the first node; otherwise, it's a list.
As aredridel states, .select() will return an array by default when you are selecting nodes. So you would need to obtain your node from that array.
You can also use .select1() if you only want to select a single node:
var bodyByXpath = xpath.select1('//*[@id = "test"]', doc);
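If you would rather keep using .select(), the array it returns can be unwrapped before appending. A minimal sketch follows; firstNode is a hypothetical helper and not part of the xpath package.

```javascript
// Return the first node of a select() result, or null when nothing matched.
// Accepts either an array of nodes or a single node.
function firstNode(selected) {
  if (Array.isArray(selected)) {
    return selected.length > 0 ? selected[0] : null;
  }
  return selected === undefined || selected === null ? null : selected;
}

// Intended use with the variables from the question:
// var body = firstNode(xpath.select('//*[@id = "test"]', doc));
// if (body) body.appendChild(h1);
```

Checking for null before appending also covers the case where the expression matches nothing.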

Disable a "link" tag without access to HTML

So I have a website I'm working on and in one part we have an embedded newsletter signup. The problem is that the embedded code uses all its own stylesheets which interferes with the design of the site. The embed is done through javascript so I cannot disable them until that section of the page loads.
Basically I need a script to disable an entire <link>. On top of that, the links don't have any classes or ids so they are hard to target.
<link rel="stylesheet" type="text/css" href="http://www.formstack.com/forms/css/3/default.css?20130404">
This is one of the links I need to disable. I tried looking for something like a getElementByType or similar but I couldn't find anything.
Any help would be appreciated. As long as the code disables the link that's good enough for me. Maybe there is a way to search the document for the <link> string and surround it with comments?
Thanks guys
PS, I'm a javascript novice and have no idea what I'm doing with js
var test = "http://www.formstack.com/forms/css/3/default.css";
for (var i = 0; i < document.styleSheets.length; i++) {
    var sheet = document.styleSheets.item(i);
    // href is null for inline <style> sheets, so check it before calling indexOf
    if (sheet.href && sheet.href.indexOf(test) !== -1) sheet.disabled = true;
}
This will work; however, it is still inefficient, as it continues to check the remaining CSSStyleSheets in the StyleSheetList after it has found its match.
If you don't need to support older browsers, you can use Array.prototype.some to reduce the number of operations:
[].some.call(document.styleSheets, function(sheet) {
    return sheet.disabled = !!sheet.href && sheet.href.indexOf(test) !== -1;
});
See: Array some method on MDN
edit:
For a mix of performance AND legacy support, the following solution would work:
var test = "http://www.formstack.com/forms/css/3/default.css";
for (var i = 0; i < document.styleSheets.length; i++) {
    var sheet = document.styleSheets.item(i);
    if (sheet.href && sheet.href.indexOf(test) !== -1) {
        sheet.disabled = true;
        break;
    }
}
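Since the question mentions that the embed brings several stylesheets of its own, here is a sketch that disables every sheet whose URL starts with a given prefix instead of stopping at the first match. disableSheetsFrom is a hypothetical helper, and the prefix in the usage comment is only a guess based on the one link shown in the question.

```javascript
// Disable every sheet whose href starts with `prefix`.
// `sheets` is any array-like of objects with `href` and `disabled`
// (document.styleSheets in a browser). Returns how many were disabled.
function disableSheetsFrom(sheets, prefix) {
  var count = 0;
  for (var i = 0; i < sheets.length; i++) {
    var sheet = sheets[i];
    // href is null for inline <style> blocks, so check it first.
    if (sheet.href && sheet.href.indexOf(prefix) === 0) {
      sheet.disabled = true;
      count++;
    }
  }
  return count;
}

// In a browser (the prefix is an assumption, adjust it to the real URLs):
// disableSheetsFrom(document.styleSheets, 'http://www.formstack.com/forms/css/');
```

Because the embed is injected by JavaScript, this would have to run after the embed has added its link tags.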

getElementById javascript undefined

I am currently working on this script for Greasemonkey. The goal of this script is to remove posts from specific users on the feed of the website MeetMe.com.
My code is:
// ==UserScript==
// @name PostBeGone
// @namespace TestingNameSpace
// @include http://www.meetme.com/apps/home
// @version 1
// ==/UserScript==
var posterId;
var blacklist = new Array();
var toDelete;
blacklist [0] = 45112400; //These are just random peoples' user Id's that I am
blacklist [1] = 9649820; //using to test this script
blacklist [2] = 55907221;
blacklist [3] = 56788411;
window.onload = function checkAndRemove () {
var children = document.getElementById('feedReloadArea').childnodes;
alert(children); //alert says "undefined"
i = 0;
While (i < children.length)
{
posterId = children[i].getAttribute('data-poster');
toDelete = null;
i2 = 0;
while (i2 < blacklist.length)
{
if (posterId == blacklist[i2])
{
toDelete = children[i];
break;
}
i2++;
}
if (toDelete != null)
{
toDelete.parentNode.removeChild(toDelete);
}
i++;
}
}
Using various alerts with multiple executions, I know that the code executes up to the point where I have alert(children), which is returning undefined.
Prior to having window.onload = in my script, in scouring Google and this website for answers, I read many places that the problem might be that the script could be trying to execute before the page was loaded, so I added the window.onload =. The problem persisted, however, and I can't find any questions similar enough to mine to make sense of it.
To see the html code for the elements, I've been using Firefox's "Inspect Element" feature. A snippet of the html code on the page that may help is:
<div id="feedReloadArea" style="display: block;">
<div class="feedItemArea feedSpotlightHighlight" data-comment-maintenance="0" data-created-at="1360098729.05091" data-numeric-reference-id="" data-reference-uuid="0d76957f-2f64-4654-99ea-f67974116b32" data-entity="StatusUpdate" data-poster="59538173" data-uuid="b9e85465-5b91-4b5c-a443-c4e37d716481"></div><div class="feedItemArea" data-comment-maintenance="0" data-created-at="1360103761.624714" data-numeric-reference-id="" data-reference-uuid="c2c201ca-a391-4fff-aec2-9823a3e90815" data-entity="StatusUpdate" data-poster="45508368" data-uuid="11149a32-38cf-4919-b081-0bd39bdc49eb"></div>
<div class="feedItemArea" data-comment-maintenance="0" data-created-at="1360103737.756343" data-numeric-reference-id="" data-reference-uuid="997824a2-994c-467f-bfbe-aa4beb1e402f" data-entity="StatusUpdate" data-poster="38033716" data-uuid="bc075882-771c-4ee2-ad28-dbf21fbf3bd3"></div>
To clarify, that is only part of the HTML code. There are many of those classes on the feed, as each one represents one user's post. Also, each class has multiple within it, for things like the user's link and the user's profile picture. I, however, am wanting to remove full feedItemArea classes that have the same poster id as any that are on the blacklist.
I hope that I have been clear and concise enough to be easy to help, but if any other information is needed in order to help me, let me know and I will post it. My question is what is causing children to be undefined? Thanks in advance.
JavaScript is case-sensitive, and it's childNodes instead of childnodes. Btw, since you probably want to iterate element nodes only (no text nodes, comments, etc.), use the children collection. Also, you hardly need the window.onload, since GreaseMonkey scripts are executed on DOMready by default.
Capitalization!
var children = document.getElementById('feedReloadArea').childNodes;
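One more thing to watch once children is defined: childNodes and children are live collections, so removing children[i] and then incrementing i skips the element that slides into the removed slot. A sketch that collects the matches first and removes them afterwards is below. findBlacklisted is a hypothetical helper, and the comparison is done on strings because getAttribute returns a string while the blacklist holds numbers.

```javascript
// Return the items whose data-poster attribute is on the blacklist.
// `items` is any array-like of objects with getAttribute() (for example
// element.children); `blacklist` is an array of numeric or string ids.
function findBlacklisted(items, blacklist) {
  var blocked = {};
  for (var b = 0; b < blacklist.length; b++) {
    blocked[String(blacklist[b])] = true;
  }
  var matches = [];
  for (var i = 0; i < items.length; i++) {
    var poster = items[i].getAttribute('data-poster');
    if (poster !== null && blocked[poster] === true) {
      matches.push(items[i]);
    }
  }
  return matches;
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== 'undefined') {
  var feed = document.getElementById('feedReloadArea');
  if (feed) {
    findBlacklisted(feed.children, [45112400, 9649820]).forEach(function (el) {
      el.parentNode.removeChild(el);
    });
  }
}
```

Since the matches are copied into a plain array before anything is removed, the removal loop is not affected by the collection shrinking.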

Extract all links from a string

I have a javascript variable containing the HTML source code of a page (not the source of the current page), I need to extract all links from this variable.
Any clues as to what's the best way of doing this?
Is it possible to create a DOM for the HTML in the variable and then walk that?
I don't know if this is the recommended way, but it works: (JavaScript only)
var rawHTML = '<html><body>barzort</body></html>';
var doc = document.createElement("html");
doc.innerHTML = rawHTML;
var links = doc.getElementsByTagName("a")
var urls = [];
for (var i=0; i<links.length; i++) {
urls.push(links[i].getAttribute("href"));
}
alert(urls)
If you're using jQuery, you can do this really easily, I believe:
var doc = $(rawHTML);
var links = $('a', doc);
http://docs.jquery.com/Core/jQuery#htmlownerDocument
This is especially useful if you need to replace links...
var linkReg = /(<[Aa]\s(.*?)<\/[Aa]>)/g; // lazy match, so each link is matched separately
var linksInText = text.match(linkReg);
If you're running Firefox, yes you can! It's called DOMParser; check it out:
DOMParser is mainly useful for applications and extensions based on Mozilla platform. While it's available to web pages, it's not part of any standard and level of support in other browsers is unknown.
If you are running outside a browser context and don't want to pull a HTML parser dependency, here's a naive approach:
var html = `
<html><body>
Example
<p>text</p>
<a download href='./doc.pdf'>Download</a>
</body></html>`
var anchors = /<a\s[^>]*?href=(["']?)([^\s]+?)\1[^>]*?>/ig;
var links = [];
html.replace(anchors, function (_anchor, _quote, url) {
links.push(url);
});
console.log(links);
