How to normalize HTML in JavaScript or jQuery?

Tags can have multiple attributes. The order in which attributes appear in the code does not matter. For example:
<a href="#" title="#">
<a title="#" href="#">
How can I "normalize" the HTML in Javascript, so the order of the attributes is always the same? I don't care which order is chosen, as long as it is always the same.
UPDATE: my original goal was to make it easier to diff (in JavaScript) two HTML pages with slight differences. Because users could use different software to edit the code, the order of the attributes could change, which made the diff too verbose.
ANSWER: Well, first thanks for all the answers. And YES, it is possible. Here is how I've managed to do it. This is a proof of concept, it can certainly be optimized:
// Comparator: order attributes alphabetically by name.
function sort_attributes(a, b) {
    if (a.name == b.name) {
        return 0;
    }
    return (a.name < b.name) ? -1 : 1;
}
$("#original").find('*').each(function() {
    if (this.attributes.length > 1) {
        var attributes = this.attributes;
        var list = [];
        var i;
        // Copy the live attribute list so it can be sorted safely.
        for (i = 0; i < attributes.length; i++) {
            list.push(attributes[i]);
        }
        list.sort(sort_attributes);
        // Remove every attribute, then re-add them in sorted order.
        for (i = 0; i < list.length; i++) {
            this.removeAttribute(list[i].name);
        }
        for (i = 0; i < list.length; i++) {
            this.setAttribute(list[i].name, list[i].value);
        }
    }
});
Same thing for the second element of the diff, $('#different'). Now $('#original').html() and $('#different').html() show HTML code with attributes in the same order.

JavaScript doesn't actually see a web page in the form of text-based HTML, but rather as a tree structure known as the DOM, or Document Object Model. The order of HTML element attributes in the DOM is not defined (in fact, as Svend comments, they're not even part of the DOM), so the idea of sorting them at the point where JavaScript runs is irrelevant.
I can only guess what you're trying to achieve. If you're trying to do this to improve JavaScript/page performance, most HTML document renderers already presumably put a lot of effort into optimising attribute access, so there's little to be gained there.
If you're trying to order attributes to make gzip compression of pages more effective as they're sent over the wire, understand that JavaScript runs after that point in time. You may want to look at things that run server-side instead, though it's probably more trouble than it's worth.

Take the HTML and parse into a DOM structure. Then take the DOM structure, and write it back out to HTML. While writing, sort the attributes using any stable sort. Your HTML will now be normalized with regard to attributes.
This is a general way to normalize things. (parse non-normalized data, then write it back out in normalized form).
I'm not sure why you'd want to Normalize HTML, but there you have it. Data is data. ;-)
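The parse-then-write approach described above can be sketched as a small serializer. This is only an illustration: serializeSorted, escapeText and escapeAttr are hypothetical names, the function expects nodes that expose the standard nodeType, nodeName, attributes and childNodes properties, and it ignores comments and the special text rules for script and style elements.

```javascript
// Sketch only: serialize a DOM-like node to HTML with attributes sorted by name.
// serializeSorted, escapeText and escapeAttr are hypothetical helper names.
var VOID_TAGS = { area: 1, base: 1, br: 1, col: 1, embed: 1, hr: 1, img: 1,
                  input: 1, link: 1, meta: 1, source: 1, track: 1, wbr: 1 };

function escapeText(text) {
    return String(text).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function escapeAttr(value) {
    return String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

function serializeSorted(node) {
    if (node.nodeType === 3) {          // text node
        return escapeText(node.nodeValue);
    }
    if (node.nodeType !== 1) {          // comments etc. are skipped in this sketch
        return '';
    }
    var tag = node.nodeName.toLowerCase();
    // Copy the attribute list into a plain array so it can be sorted.
    var attrs = Array.prototype.slice.call(node.attributes);
    attrs.sort(function (a, b) {
        return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
    });
    var html = '<' + tag;
    attrs.forEach(function (attr) {
        html += ' ' + attr.name + '="' + escapeAttr(attr.value) + '"';
    });
    html += '>';
    // Void elements such as <br> and <img> have no children and no end tag.
    if (Object.prototype.hasOwnProperty.call(VOID_TAGS, tag)) {
        return html;
    }
    Array.prototype.forEach.call(node.childNodes, function (child) {
        html += serializeSorted(child);
    });
    return html + '</' + tag + '>';
}
```

In a browser, calling serializeSorted on the root element of each page should give two strings whose attribute order no longer depends on the editor that produced them, so they can be handed to a diff.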

You can try opening the HTML tab in Firebug; the attributes are always shown in the same order.

Actually, I can think of a few good reasons. One would be comparison for identity matching and for use with 'diff' type tools where it is quite annoying that semantically equivalent lines can be marked as "different".
The real question is "Why in Javascript"?
This question "smells" of "I have a problem and I think I have an answer...but I have a problem with my answer, too."
If the OP would explain why they want to do this, their chances of getting a good answer would go up dramatically.

The question "What is the need for this?"
Answer: It makes the code more readable and easier to understand.
Why most UI sucks... Many programmers fail to understand the need for simplifying the user's job. In this case, the user's job is reading and understanding the code.
One reason to order the attributes is for the human who has to debug and maintain the code. An ordered list, which the programmer becomes familiar with, makes his job easier. He can more quickly find attributes, or realize which attributes are missing, and more quickly change attribute values.

This only matters when someone is reading the source, so for me it's semantic attributes first, less semantic ones next...
There are exceptions of course, if you have for example consecutive <li>'s, all with one attribute on each and others only on some, you may want to ensure the shared ones are all at the start, followed by individual ones, eg.
<li a="x">A</li>
<li a="y" b="t">B</li>
<li a="z">C</li>
(Even if the "b" attribute is more semantically useful than "a")
You get the idea.

It is actually possible, I think, if the HTML contents are passed as XML and rendered through XSLT... therefore your original content in XML can be in whatever order you want.

Related

Is there a jQuery UI "anti-sortable" that will let LI's be taken from a UL, but always specifying order?

I have something partially working where there are two UL's, both having had jQuery UI .sortable() called on it. The user can and potentially should drop LI's from one to the other. I am looking to have the second list really be sortable, but the first list retain a single ordering instead of having a LI from the second list appended at the end if the user clicks on it.
I see one painfully obvious way to do it: keep a JavaScript list of values of LI's, or alternately set a data-index='0' (then, 1, 2, 3, etc.), and then in either case make a single, possibly bottom-up, sweep of the dread bubble sort.
This appears to me something I could straightforwardly get working, but it has an "If you're doing it this way, you're working too low-level" code smell to me. Apart from a bubble sort reference, in a case where I think O(n) really is tolerable, it seems like something where someone who knew jQuery UI could produce a much shorter and clearer implementation.
I've outlined above the hard way of addressing my problem. What easy ways should I consider instead?

After a little more hesitation, I decided to go with the obvious solution, even if it doesn't smell like a usual jQuery optimal solution. I added an ascending integer data- field to the original LI's in the server-side code that generates them, and client-side created a single bubble sort iteration function that should work if all but the last item (re-added items start out last) are in order:
var get_index = function(node)
{
    // Always pass a radix so the value is parsed as decimal.
    return parseInt(jQuery(node).attr('data-sort-position'), 10);
};
var sort_available = function()
{
    var len = jQuery('#available > li').length;
    if (len >= 2)
    {
        var moved = false;
        for (var index = 0; index < len; index += 1)
        {
            if (!moved)
            {
                if (get_index(jQuery('#available > li')[index]) >
                    get_index(jQuery('#available > li')[len - 1]))
                {
                    // Move the last (re-added) item in front of the
                    // first item with a larger sort position.
                    var node = jQuery(
                        jQuery('#available > li')[len - 1]).detach();
                    jQuery(node).insertBefore(
                        jQuery('#available > li')[index]);
                    moved = true;
                }
            }
        }
    }
};
I then added a sort_available() call to the end of the handler that is called when a member of the other, destination list is clicked:
jQuery('#included li').click(function(event)
{
    jQuery(event.target).detach();
    jQuery('#available').append(event.target);
    assign_clicks();
    sort_available();
});
Now it seems to be putting items back in their original places.
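As a possible alternative to appending the item and then bubbling it into place, the insertion point can be computed first from the data-sort-position values. The sketch below is not from the answer above: findInsertIndex is a hypothetical helper, and it assumes the positions already in the list are in ascending order.

```javascript
// Hypothetical helper: given the positions already in the list (ascending)
// and the position of the returning item, find the index where it belongs.
function findInsertIndex(positions, position) {
    for (var i = 0; i < positions.length; i++) {
        if (positions[i] > position) {
            return i;
        }
    }
    return positions.length; // larger than everything: belongs at the end
}

// Possible use in the click handler (assumes the markup and get_index above):
//   var items = jQuery('#available > li');
//   var positions = items.map(function () { return get_index(this); }).get();
//   var at = findInsertIndex(positions, get_index(event.target));
//   if (at < items.length) { jQuery(event.target).insertBefore(items[at]); }
//   else { jQuery('#available').append(event.target); }
```

This way the item is inserted once, directly at its original place, instead of being appended and then moved.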

Updating page elements with JS efficiently

I am making a game and using hundreds of lines of
document.getElementById("ID").innerHTML = someVariable;
to update everything I calculated with functions. How could I make it better or is this the best method? I can't really use loops for it like
for (var i = 0; i < IDs.length; i++) {
    document.getElementById(IDs[i]).innerHTML = Variables[i];
}
because I have so many different variables and different IDs.
So should I rework everything into arrays and use them or what?
Thanks for any advice!

There are several things that you might do to improve performance a little. Unfortunately jsperf is down for maintenance at the moment, but I'm not sure that it would help much because the efficiency of DOM manipulations varies so widely between browsers and the particulars of an application.
The first thing I would suggest is a set of "micro efficiencies" which may not even help much but are good practice nonetheless. Keep an array of references to your DOM elements so you don't have to call getElementById often. While getElementById is very fast, it is still a function call which can be avoided. You might think that the array will take up a lot of memory, but you aren't actually storing the DOM elements in the array but rather storing pointers to DOM elements. You can also keep a reference to the length of the array so that it's not calculated on every loop iteration:
const myDivs = [/* populate this array with your DOM elements */];
for (let i = 0, l = myDivs.length; i < l; i++) {
    myDivs[i].innerHTML = Variables[i];
}
Another thing which is important is to wait for an animation frame before doing a heavy DOM manipulation. You can read more about requestAnimationFrame here.
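The two ideas above (cached element references and waiting for an animation frame) can be combined in a small updater. This is a sketch under assumptions, not a definitive implementation: createUpdater is a hypothetical name, the lookup and schedule functions are passed in so the code can also run outside a browser, and it writes textContent on the assumption that the values are plain text rather than markup.

```javascript
// Sketch: queue updates by element id and apply them all in one frame.
// `lookup` finds an element by id, `schedule` defers a callback; in a page
// these would wrap document.getElementById and requestAnimationFrame.
function createUpdater(lookup, schedule) {
    var cache = Object.create(null);    // id -> element reference, filled lazily
    var pending = Object.create(null);  // id -> latest value waiting to be written
    var scheduled = false;

    function flush() {
        scheduled = false;
        Object.keys(pending).forEach(function (id) {
            var el = cache[id] || (cache[id] = lookup(id));
            if (el) {
                el.textContent = pending[id];
            }
        });
        pending = Object.create(null);
    }

    // Record the newest value for an id; only one flush is scheduled per frame.
    return function set(id, value) {
        pending[id] = String(value);
        if (!scheduled) {
            scheduled = true;
            schedule(flush);
        }
    };
}

// Possible browser wiring:
//   var set = createUpdater(
//       function (id) { return document.getElementById(id); },
//       function (cb) { requestAnimationFrame(cb); });
//   set('score', 42);
```

Calling set many times between frames only touches each element once, with its most recent value.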
Finally, you are just going to have to test. As someone suggested in a comment your best bet is to do direct DOM manipulation without using innerHTML. The reason for this is because innerHTML requires parsing and render tree construction which is expensive. An example of direct DOM manipulation would be... let's say you want to go from this state:
<div id="ID[1]">
<img src="foo.jpg" />
</div>
... to this state:
<div id="ID[1]">
<h1>Hello</h1>
<img src="foo.jpg" />
</div>
We just added an H1 - you would write something like this:
const h1 = document.createElement('h1');
h1.innerText = 'Hello';
const div = document.getElementById('ID[1]');
div.insertBefore(h1, div.firstChild);
But as you can see, this requires you to do a "diff" between the before and after states and calculate the most efficient way to get from one to the other. It just so happens that this is what ReactJS and other Virtual DOM libraries do very efficiently - so you might try out one of those libraries.
If you are against using a library or think that your DOM will fluctuate too much for in-memory diffing to be efficient, then you might try constructing a DOM string and using innerHTML as few times as possible. For example, if your DOM looks like this:
<div id="main-container">
<div id="ID[1]">...</div>
<div id="ID[2]">...</div>
<div id="ID[3]">...</div>
...
</div>
Then try doing something like the following:
let html = '';
const container = document.getElementById('main-container');
for (let i = 0, l = IDs.length; i < l; i++) {
html += `<div id="ID[${i}]">${Variables[i]}</div>`;
}
// you want to set innerHTML as few times as possible
container.innerHTML = html;

What is better, appending new elements via DOM functions, or appending strings with HTML tags?

I have seen a few different methods to add elements to the DOM. The most prevalent seem to be, for example, either
document.getElementById('foo').innerHTML = '<p>Here is a brand new paragraph!</p>';
or
var newElement = document.createElement('p');
var elementText = document.createTextNode('Here is a brand new paragraph!');
newElement.appendChild(elementText);
document.getElementById('foo').appendChild(newElement);
but I'm not sure of the advantages to doing either one. Is there a rule of thumb as to when one should be done over the other, or is one of these just flat out wrong?

Some notes:
Using innerHTML is faster in IE, but slower in chrome + firefox. Here's one benchmark showing this with a constantly varying set of <div>s + <p>s; here's a benchmark showing this for a constant, simple <table>.
On the other hand, the DOM methods are the traditional standard -- innerHTML is standardized in HTML5 -- and allow you to retain references to the newly created elements, so that you can modify them later.
Because innerHTML is fast (enough), concise, and easy to use, it's tempting to lean on it for every situation. But beware that assigning to innerHTML replaces the element's existing child nodes, detaching them from the document, so any references you hold to them go stale. Here's an example you can test on this page.
First, let's create a function that lets us test whether a node is on the page:
function contains(parent, descendant) {
return Boolean(parent.compareDocumentPosition(descendant) & 16);
}
This will return true if parent contains descendant. Test it like this:
var p = document.getElementById("portalLink")
console.log(contains(document, p)); // true
document.body.innerHTML += "<p>It's clobberin' time!</p>";
console.log(contains(document, p)); // false
p = document.getElementById("portalLink")
console.log(contains(document, p)); // true
This will print:
true
false
true
It may not look like our use of innerHTML should have affected our reference to the portalLink element, but it does. It needs to be retrieved again for proper use.

There are a number of differences:
innerHTML has only been standardised by the W3C for HTML 5; even though it has been a de facto standard for some time now across all popular browsers, technically in HTML 4 it's a vendor extension that standards-adherent developers would never be caught dead using. On the other hand, it's much more convenient and practically it's supported by all browsers.
innerHTML replaces the current content of the element (it does not let you modify it). But again, you gain in convenience if you don't mind this limitation.
innerHTML has been measured to be much faster (admittedly, that test involves older versions of browsers that are not widely used today).
innerHTML might represent a security risk (XSS) if it's set to a user-supplied value that has not been properly encoded (e.g. el.innerHTML = '<script>...').
Based on the above, it seems that a practical conclusion might be:
If you don't mind the fact that innerHTML is a bit limiting (only total replacement of DOM sub-tree rooted at target element) and you don't risk a vulnerability through injecting user-supplied content, use that. Otherwise, go with DOM.
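To illustrate the XSS point, here is a minimal sketch of encoding a user-supplied value before it is placed into an innerHTML string. escapeHtml is a hypothetical helper, not a built-in, and it only covers text placed in element content or quoted attribute values.

```javascript
// Hypothetical helper: encode the characters that are significant in HTML.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // must run first so later entities are not re-encoded
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Possible use (userInput is an assumed variable holding untrusted text):
//   el.innerHTML = '<p>' + escapeHtml(userInput) + '</p>';
```

When the value is only ever plain text, assigning it to textContent avoids the problem without any encoding step.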

Though this is an old thread, one thing that has not been mentioned is that while innerHTML can be faster, care should be taken. Assigning to innerHTML makes the browser re-parse and re-create every child of the modified element, old and new alike. As such, a single innerHTML assignment is (slightly) faster than DOM create/append, but multiple innerHTML assignments will definitely be slower.
For example:
for (let i = 0; i < 10; i++)
    document.body.innerHTML += '<div>some text</div>';
will be nearly 5x slower than
let html = '';
for (let i = 0; i < 10; i++)
    html += '<div>some text</div>';
document.body.innerHTML = html;
Since an innerHTML assignment lets the browser natively create/append elements, the second method results in 10 elements being natively created/appended, while the first method results in 55 elements being created/appended (and 45 being destroyed): 1 element created on the first loop iteration, 2 elements created on the second loop iteration (the original being destroyed), 3 elements created on the third loop iteration (the previous 2 being destroyed), and so on.
If you use innerHTML for speed, build the entire HTML string first and then make a single innerHTML assignment, ideally into a fresh container or element. innerHTML is a performance loser when appending to any container with existing childNodes, especially one with a large number of childNodes.
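The 55 and 45 figures above are just the sum 1 + 2 + ... + 10 and that sum minus the 10 elements that survive. A tiny sketch of the arithmetic (countCreated is a hypothetical name used only for illustration):

```javascript
// Count how many elements are created when innerHTML += is used `appends` times
// on a container that starts empty.
function countCreated(appends) {
    var created = 0;
    for (var pass = 1; pass <= appends; pass++) {
        // Pass k re-creates the k - 1 existing elements and adds 1 new one.
        created += pass;
    }
    return created;
}

var created = countCreated(10);   // 55
var destroyed = created - 10;     // 45: everything except the 10 final elements
```

In general the count is n(n + 1) / 2, so the cost of repeated innerHTML += grows quadratically with the number of appends.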

According to this benchmark data, you will receive much faster results with innerHTML than creating DOM elements. It's especially clear when using older IE versions.

The first one is straightforward, easier to read, less code and might be faster.
The second one gives you much more control over the element you create, i.e. it makes it much easier to modify the new element using JS (like attaching events, or just using it in your code).
The second way is for "purists" who like "clean" code (no quick and dirty).
I say, use both, see what fits you better and go with it.

I always prefer readability unless the perf difference is extreme. In a one-off case like this, the difference will probably be marginal, and setting the innerHTML property will be easiest to read.
But if you are doing a lot of programmatic content generation in JavaScript, it is cleaner and easier to read and understand the DOM option.
Example:
Compare this innerHTML code:
http://jsfiddle.net/P8m3K/1/
// Takes input of a value between 1 and 26, inclusive,
// and converts it to the appropriate character
function alphaToChar(alpha)
{
    return String.fromCharCode('a'.charCodeAt(0) + alpha - 1);
}
var content = "<ul>";
for (var i = 0; i < 10; ++i)
{
    content += "<li>";
    for (var j = 1; j <= 26; ++j)
    {
        content += "<a href=\"" + alphaToChar(j) + ".html\">"
            + alphaToChar(j)
            + "</a>";
    }
    content += "</li>";
}
// Close the list before assigning the markup.
content += "</ul>";
document.getElementById("foo").innerHTML = content;
To this DOM code:
http://jsfiddle.net/q6GB8/1/
// Takes input of a value between 1 and 26, inclusive,
// and converts it to the appropriate character
function alphaToChar(alpha)
{
    return String.fromCharCode('a'.charCodeAt(0) + alpha - 1);
}
var list = document.createElement("ul");
for (var i = 0; i < 10; ++i)
{
    var item = document.createElement("li");
    for (var j = 1; j <= 26; ++j)
    {
        var link = document.createElement("a");
        link.setAttribute("href", alphaToChar(j) + ".html");
        link.textContent = alphaToChar(j);
        item.appendChild(link);
    }
    list.appendChild(item);
}
document.getElementById("foo").appendChild(list);
At this level they start to become quite similar length wise.
But the DOM code will be easier to maintain, and you're a bit less likely to make a typo or mistake that is hard to diagnose, like omitting a closing tag. Either your elements will be in your document, or they won't.
With more complicated scenarios (like building treed menus), you'll probably come out ahead with DOM code.
With scenarios where you have to append multiple types of content together to build a document with more heterogeneous content, it becomes a slam dunk. You don't have to ensure you call your child append code before calling the parent append code.
With scenarios where you add, remove, or modify existing static content, DOM will usually win.
If you start doing complicated DOM modifications (one of the last things I mentioned), you'll definitely want to check out a library built around DOM modifications, like jQuery.

How to use a "for" loop instead of the "each" function in jQuery

I've been stuck on this jQuery problem all morning and can't resolve it.
Here is my markup:
<div class="container">
    <p class="test">a</p>
    <div>
        <p class="test">a</p>
    </div>
</div>
Normally I would use jQuery's each function to select every <p class="test">a</p>, for example:
$(".test").each(function() {
    $(this).text('a');
});
But I've heard that a for loop takes less time than the each function. I'd like to use for instead of each, but I don't know how to write the jQuery code in this case.
Can somebody help me? Thank you!

I wouldn't worry about it unless you were iterating through hundreds of them.
for loop is usually used with normal DOM (aka without jQuery) traversing, like...
var elements = document.getElementById('something').getElementsByTagName('a');
var elementsLength = elements.length;
for (var i = 0; i < elementsLength; i++) {
elements[i].style.color = 'red';
}
Caching of elementsLength is a good idea so it is not calculated every iteration. Thanks to CMS for this suggestion in the comments.
Just adapt that for your jQuery object if you wanted to do it with jQuery.
Replace elements variable with your jQuery collection, like $('#something a'). I think you may need to rewrap the object if you need to do any more jQuery stuff with it.

One thing to watch out for is that using an ordinal accessor on the result of a jQuery selection will return a native DOM element. If you want to use jQuery methods on them, you have to re-wrap them:
var testElements = $('.test');
for (var i = 0; i < testElements.length; i++) {
// Using $() to re-wrap the element.
$(testElements[i]).text('a');
}
I'd second what others have said though. Unless you're dealing with many elements, this is premature optimization. Re-wrapping the elements to use the .text() method may even bring it back to no gain at all.

Have you tried the obvious solution?
var nodes = $(".test");
for(var i = 0; i < nodes.length; i++)
{
var node = nodes[i];
}

This article shows that each() has no significant performance penalty until you get into the hundreds of thousands of looped-over items.

Another alternative:
// Query once and reuse the collection instead of re-running the selector.
var tests = $('.test');
for (var i = 0; i < tests.length; i++) {
    var element = tests.eq(i);
}

Removing items from data bound array

How do I remove items from a data-bound array? My code follows.
for(var i = 0; i < listBox.selectedIndices.length; i++) {
var toRemove = listFiles.selectedIndices[i];
dataArray.splice(toRemove, 1);
}
Thanks in advance!
Edit Here is my swf. The Add Photos works except when you remove items.
http://www.3rdshooter.com/Content/Flash/PhotoUploader.html
Add 3 different photos.
Remove 2nd photo.
Add a different photo.
SWF adds the 2nd photo to the end.
Any ideas on why it would be doing this?
Edit 2 Here is my code
private function OnSelectFileRefList(e:Event):void
{
    Alert.show('addstart:' + arrayQueue.length);
    for each (var f:FileReference in fileRefList.fileList)
    {
        var lid:ListItemData = new ListItemData();
        lid.fileRef = f;
        arrayQueue[arrayQueue.length]=lid;
    }
    Alert.show('addcomplete:' + arrayQueue.length);
    listFiles.executeBindings();
    Alert.show(ListItemData(arrayQueue[arrayQueue.length-1]).fileRef.name);
    PushStatus('Added ' + fileRefList.fileList.length.toString() + ' photo(s) to queue!');
    fileRefList.fileList.length = 0;
    buttonUpload.enabled = (arrayQueue.length > 0);
}
private function OnButtonRemoveClicked(e:Event):void
{
    for(var i:Number = 0; i < listFiles.selectedIndices.length; i++) {
        var toRemove:Number = listFiles.selectedIndices[i];
        //Alert.show(toRemove.toString());
        arrayQueue.splice(toRemove, 1);
    }
    listFiles.executeBindings();
    Alert.show('removecomplete:' + arrayQueue.length);
    PushStatus('Removed photos from queue.');
    buttonRemove.enabled = (listFiles.selectedItems.length > 0);
    buttonUpload.enabled = (arrayQueue.length > 0);
}

It would definitely be helpful to know two things:
Which version of ActionScript are you targeting?
Judging from the behavior of your application, the error isn't occurring when the user removes an item from the list of files to upload. Looks more like an issue with your logic when a user adds a new item to the list. Any chance you could post that code as well?
UPDATE:
Instead of: arrayQueue[arrayQueue.length]=lid
Try: arrayQueue.push(lid)
That will add a new item to the end of the array and push the item in to that spot.
UPDATE 2:
Ok, did a little more digging. Turns out that the fileList doesn't get cleared every time the dialog is opened (if you're not creating a new instance of the FileReferenceList each time the user selects new files). You need to call splice() on the fileList after you add each file to your Array.
Try something like this in your AddFile() method...
while (fileRefList.fileList.length > 0)
{
    // Always take the first entry; removing it shifts the rest down.
    arrayQueue.push(fileRefList.fileList[0]);
    fileRefList.fileList.splice(0, 1);
}
That will keep the fileList up to date rather than holding on to previous selections.

I see one issue. The selected indices are no longer valid once you have spliced out the first element from the array. But that should only be a problem when removing multiple items at once.
I think we need to see more code about how you are handling the upload before we can figure out what is going on. It looks to me like you are holding a reference to the removed FileReference or something. The described problem is occurring when you upload a new file, not when you remove the selected one.
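One common way around the stale-index issue described above is to remove the highest index first, so that the indices still to be processed are not shifted. A minimal sketch in JavaScript, assuming the selected indices are unique and valid (removeIndices is a hypothetical name):

```javascript
// Remove the given indices from `array` in place, highest index first,
// so earlier removals never shift the positions still to be removed.
function removeIndices(array, indices) {
    var descending = indices.slice().sort(function (a, b) {
        return b - a;
    });
    descending.forEach(function (index) {
        array.splice(index, 1);
    });
    return array;
}
```

Because the array is modified in place, anything bound to the same array instance keeps pointing at it.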

Do you mean to use listBox and listFiles to refer to the same thing?
I'm stepping out on a limb here, because I don't have a ton of experience with JavaScript, but I'd do this the same way that I'd do it in C, C++, or Java: By copying the remaining array elements down into their new locations.
Assuming that listFiles.selectedIndices is sorted (and its contents are valid indices for dataArray), the code would be something like the following:
(WARNING: untested code follows.)
// Don't bother copying any elements below the first selected element.
var writeIndex = listFiles.selectedIndices[0];
var readIndex = listFiles.selectedIndices[0] + 1;
var selectionIndex = 1;
while(writeIndex < (dataArray.length - listFiles.selectedIndices.length)) {
    if (selectionIndex < listFiles.selectedIndices.length) {
        // If the read pointer is currently at a selected element,
        // then bump it up until it's past selected range.
        while(selectionIndex < listFiles.selectedIndices.length &&
              readIndex == listFiles.selectedIndices[selectionIndex]) {
            selectionIndex++;
            readIndex++;
        }
    }
    dataArray[writeIndex++] = dataArray[readIndex++];
}
// Remove the tail of the dataArray
if (writeIndex < dataArray.length) {
    dataArray.splice(writeIndex, dataArray.length - writeIndex);
}
EDIT 2009/04/04: Your Remove algorithm still suffers from the flaw that as you remove items in listFiles.selectedIndices, you break the correspondence between the indices in arrayQueue and those in listFiles.selectedIndices.
To see this, try adding 3 files, then doing "Select All" and then hit Remove. It will start by removing the 1st file in the list (index 0). Now what had been the 2nd and 3rd files in the list are at indices 0 and 1. The next value taken from listFiles.selectedIndices is 1 -- but now, what had been the 3rd file is at index 1. So the former File #3 gets spliced out of the array, leaving the former 2nd file un-removed and at index 0. (Using more files, you'll see that this implementation only removes every other file in the array.)
This is why my JavaScript code (above) uses a readIndex and a writeIndex to copy the entries in the array, skipping the readIndex over the indices that are to be deleted. This algorithm avoids the problem of losing correspondence between the array indices. (It does need to be coded carefully to guard against various edge conditions.) I tried some JavaScript code similar to what I wrote above; it worked for me.
I suspect that the problem in your original test case (removing the 2nd file, then adding another) is analogous. Since you've only shown part of your code, I can't tell whether the array indices and the data in listFiles.selectedIndices, arrayQueue, and fileRefList.fileList are always going to match up appropriately. (But I suspect that the problem is that they don't.)
BTW, even if you fix the problem with using splice() by adjusting the array index values appropriately, it's still an O(N^2) algorithm in the general case. The array copy algorithm is O(N).
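If replacing the array with a new one is acceptable, the single-pass idea can be written more compactly with a lookup table and filter. This is a sketch rather than the copy-down algorithm above: withoutIndices is a hypothetical name, and it returns a new array instead of modifying the original.

```javascript
// Return a new array without the elements at the given indices.
// One pass over the indices plus one pass over the array.
function withoutIndices(array, indices) {
    var drop = Object.create(null);   // index -> true for every index to skip
    indices.forEach(function (index) {
        drop[index] = true;
    });
    return array.filter(function (item, index) {
        return !drop[index];
    });
}
```

The order of the indices does not matter here, since no element moves until the new array is built.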

I'd really need to see the whole class to provide a definitive answer, but I would write a method to handle removing multiple objects from the dataProvider and perhaps assign a new array as the dataProvider for the list, instead of toying with binding and using the same list for the duration. Like I said, this is probably inefficient, and would require a look at the context of the question, but that is what I would do (unless you have a big need for binding in this circumstance).
/**
 * Returns a new Array with the selected objects removed
 */
private function removeSelected(selectedItems:Array):Array
{
    var returnArray:Array = [];
    for each (var object:Object in this.arrayQueue)
    {
        if (selectedItems.indexOf(object) == -1)
            returnArray.push(object);
    }
    return returnArray;
}

You might be interested in this blog entry about the fact that robust iterators are missing in the Java language.
The programming language, you mentioned Javascript, is not the issue, it's the concept of robust iterators that I wanted to point out (the paper actually is about C++ as the programming language).
The research document about providing robust iterators for the ET++ C++ framework may still be helpful in solving your problem. I am sure the document can provide you with the necessary ideas for how to approach your problem.
