Clientside HTML Minification


Is there a way to do this kind of minification with JavaScript and update the DOM (client-side)?
Input:
<div class="parentDiv">
<div class="childDiv">Some text</div>
<div class="childDiv">Some text</div>
</div>
Output:
<div class="parentDiv"><div class="childDiv">Some text</div><div class="childDiv">Some text</div></div>
I know it's useless to minify after all the content has been downloaded.
The point here is to stop the indentation from creating gaps between my divs. I know that if I put a comment between the tags the gap won't appear, but the code gets difficult to understand with so many comments between my div tags.
See this [post] and you'll understand what I mean.

I managed to achieve what I wanted and even created a jQuery plugin for it.
jQuery.fn.clearWhiteSpace = function () {
    // Remove each newline together with the indentation spaces that follow it.
    var htmlClone = this.html().replace(/\n[ ]*/g, "");
    this.html(htmlClone);
    return this;
};
$(".parentDiv").clearWhiteSpace();
There is an example I wrote on JSFiddle.
But thanks for all your effort. :)

If it's a minification the DOM won't update. Also there's nothing client-side minification accomplishes: it's not faster to download and it's not obfuscated from the client.
For what you wrote, you can replace '\n' with '' I guess.
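To make that suggestion concrete, here is a minimal sketch that works on an HTML string rather than on the live DOM. The function name collapseInterTagWhitespace is made up for illustration, and the regex is an assumption about what "replace '\n' with ''" should mean in practice: it only drops whitespace that sits directly between a closing '>' and the next '<'.

```javascript
// Hypothetical helper (not from the original posts): remove whitespace that
// sits purely between the end of one tag and the start of the next.
function collapseInterTagWhitespace(html) {
  return html.replace(/>\s+</g, "><");
}

// The markup from the question, indented the way it would be in a source file.
const input = [
  '<div class="parentDiv">',
  '    <div class="childDiv">Some text</div>',
  '    <div class="childDiv">Some text</div>',
  '</div>',
].join("\n");

console.log(collapseInterTagWhitespace(input));
// <div class="parentDiv"><div class="childDiv">Some text</div><div class="childDiv">Some text</div></div>
```

Be aware that this also removes a single space between two inline elements and touches the contents of <pre>, so it is only safe on markup where that whitespace carries no meaning.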

Try this JavaScript minification script: http://prettydiff.com/lib/jspretty.js

You need to be careful when parsing documents, especially with special characters in attributes. You can write your own DOM parser, but, why re-invent the wheel?
Here is a great parser, written in JavaScript: https://www.npmjs.com/package/html-minifier
Instructions are documented.
The above method is to "minify" production code; however, if it's a visual spacing issue, then see below:
Update:
"White-space" is mostly ignored when it comes to block-elements.
To ensure that your inline-block elements are not separated by white space, you can arrange the code for your blocks underneath each other in a way that shows it is not a space that separates them; other than that, here's what really matters:
Proper CSS & HTML
Make sure all your HTML tags are paired correctly, so that each opening tag has a closing tag. This does not apply to void elements like <img /> or <input />, as these are self-closing.
If you need blocks placed next to each other, use <div> tags styled with CSS as display:inline-block. You can also use table cells, which do NOT have to be <td> tags, as any element can be styled with display:table-cell.
You can also have elements wrapped and packed tightly together (as mentioned above) by styling them with float:left (or float:right).
It is good practice to place your styles in CSS style sheets rather than inline, as inline styles make your code unmanageable; however, some style sheets are persistent (see below) and the only way to override such styles is with an inline style.
If you're working in someone else's code base and none of the above works, you can write a style sheet of your own that overrides the others by adding !important after each property value. You can use this to override any property, but in this case it would typically be margin or border-...
Lastly, make sure there are no non-breaking spaces between your elements if they are not needed; these look like this: &nbsp;
If you need more info on how to write the modern HTML5 markup and CSS3 style-sheet language, the "Mozilla Developer Network" is a great reference: https://developer.mozilla.org

So let's attempt to solve this issue: "The point here is to stop the indentation to create gaps between my divs." What I can deduce from that sentence, the [post] page, and its linked answer page is that client-side HTML minification isn't the correct solution for this problem.
Have you looked into using inline-block or CSS resets first, before attempting to minify the HTML code or munge it by adding blank comments between the HTML tags?
The linked answer page discusses using inline-block to eliminate the spacing, which is occurring between your HTML elements. Those two pages also discuss resetting the font styles to fix the spacing issues.
CSS resets can be used to fix gaps between elements. There is a list of the most popular CSS resets at http://cssreset.com. If needed, it should be easy to extend them to override any font settings, thus normalizing how the fonts treat white-space characters.
So empty comments shouldn't need to be injected between HTML tags, to fix spacing issues with whitespace characters. If CSS is used to fix the styles, then the HTML will be readable. If the HTML is minified, it will be harder to read & debug. I'd suggest not minifying your HTML using JavaScript. Rather try fixing the spacing issues with CSS.
(As for how minification works under the hood... see my answer at this SO question.)

Minify HTML in the browser with vanilla JS.
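const minify_html = (dom_node) => {
    // Copy the live NodeList first so that removals do not shift the iteration.
    Array.from(dom_node.childNodes).forEach(node => {
        const isTextNode = node.nodeType === 3;
        // nodeValue is null for element nodes, so only trim text nodes.
        const isEmpty = isTextNode && node.nodeValue.trim().length === 0;
        if (isEmpty) {
            dom_node.removeChild(node);
        }
    });
};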
I created an example with 1,000 elements, and my computer can minify the HTML in less than 15 ms, but it may be slower or faster depending on the device running the code.
https://jsfiddle.net/shwajyxr/
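The function above only looks at the direct children of the node it is given. If nested containers need the same treatment, a recursive variant might look like the sketch below. The name stripWhitespaceNodes and the list of tags to leave alone are my own assumptions, not part of the answer.

```javascript
// Hypothetical recursive variant: remove whitespace-only text nodes from
// `root` and from every element below it.
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE

// Assumption: whitespace inside these elements is meaningful, so skip them.
const PRESERVE = new Set(["PRE", "TEXTAREA", "SCRIPT", "STYLE"]);

function stripWhitespaceNodes(root) {
  // Snapshot the live childNodes list so removals do not shift the iteration.
  for (const node of Array.from(root.childNodes)) {
    if (node.nodeType === TEXT_NODE) {
      if (node.nodeValue.trim() === "") {
        root.removeChild(node);
      }
    } else if (node.nodeType === ELEMENT_NODE && !PRESERVE.has(node.nodeName)) {
      stripWhitespaceNodes(node);
    }
  }
  return root;
}
```

In a page it would be called as stripWhitespaceNodes(document.querySelector(".parentDiv")). Comment nodes are left in place, since only text and element nodes are inspected.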

Related

Extract CSS rules from any given element

I'm trying to find a way to extract all css rules from any given element (I have full access to the html, and css).
I have looked into other solutions such as getComputedStyle; however, it doesn't help much with certain properties such as width or height. For example, I expect it to return width: 100% when applicable, but it always returns the actual width in px. What I need is the CSS rule definition, not how it is actually rendered in the browser.
My last resort is to use a CSS inliner such as juicejs so that I can access element.style.prop, but if these JS inliners can turn CSS rules into inline CSS, they must already have extracted the CSS rules along the way. I tried to look into its source, but if there is a module out there that does the job, it would be much better than trying to extract the code from that library.

It amazes me that there are not many solutions available for this issue. I ended up finding two solutions that both work (there are probably some edge cases that I have not encountered yet).
Option 1: A getMatchedCSSRules implementation posted here:
https://stackoverflow.com/a/37958301/821517
Pros: short and concise
Cons: does not support pseudo selectors (yet)
Option 2: A very old library called CSSUtilities mentioned here:
https://stackoverflow.com/a/12023174/821517
Pros: it can handle pseudo selectors
Cons: a very old library that relies on another library that is deprecated.
I ended up using CSSUtilities and had to make some changes (hacks) to make it work with newer JS engines. I am posting both modified files here in the hope that they will help others (and that any errors I made can be spotted and fixes suggested).
New files: https://gist.github.com/yellow1912/c9dbbab97497ec42489be55e8abe73c7
Please ensure that you visit this link to download the package which contains the document file: http://www.brothercake.com/site/resources/scripts/cssutilities/
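For readers who just want the general idea behind Option 1, here is a rough sketch. It is not the code from either linked answer, and the function names are made up. It walks the style sheets, keeps the rules whose selector matches the element, and reads the declared value from the rule itself, which is how a width: 100% stays 100% instead of becoming pixels. Assumptions: rules nested inside @media blocks are not visited, and the last matching rule in source order wins, so specificity and !important are ignored.

```javascript
// Sketch only: collect the style rules whose selector matches `element`.
// In a page, `styleSheets` would normally be document.styleSheets.
function getMatchedRules(element, styleSheets) {
  const matched = [];
  for (const sheet of Array.from(styleSheets)) {
    let rules;
    try {
      rules = sheet.cssRules; // reading this throws for cross-origin sheets
    } catch (err) {
      continue;
    }
    for (const rule of Array.from(rules || [])) {
      if (!rule.selectorText) continue; // skip rules without a selector, e.g. @media
      try {
        if (element.matches(rule.selectorText)) matched.push(rule);
      } catch (err) {
        // Selector that matches() cannot parse; ignore it.
      }
    }
  }
  return matched;
}

// Declared (not computed) value of one property.
// Assumption: the last matching rule in source order wins.
function getDeclaredValue(element, styleSheets, property) {
  let value = "";
  for (const rule of getMatchedRules(element, styleSheets)) {
    const declared = rule.style.getPropertyValue(property);
    if (declared) value = declared;
  }
  return value;
}
```

A call such as getDeclaredValue(el, document.styleSheets, "width") would then return the value as written in the style sheet. Like Option 1, this does not handle pseudo selectors in any special way.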

Can TinyMCE give me the exact HTML content with all styles kept (really meaning WYSIWYG)?

It's really hard to understand how TinyMCE can be considered WYSIWYG, because I cannot get exactly what I see visually. So it is more like "what you see is just what you see".
Currently I use getContent() to get the HTML. But it lacks the embedded styles, and if we show that output HTML in some container, the visual rendering looks different.
I've tried implementing my own solution that embeds the current style (based on getComputedStyle) into each element. But that's not very efficient (many redundant styles can be included) and does not always work (for example with embedded video; I'm not sure why <video> is not kept by getContent(), and all <video>s disappear from the final output HTML).
The TinyMCE team has done a lot of work, but I'm really not sure why they did not think about this feature. We need the exact HTML that renders what you see in the editor. We can sanitize the HTML ourselves after that.
Here is a demo to help you picture what's so bothersome about this WYSIWYG editor:
https://jsfiddle.net/L83u5v0n/1/
Clicking on the Show HTML button shows this:
So you can clearly see it's more WYSIWYS than WYSIWYG. Is there a way to get the exact output HTML using some hidden TinyMCE feature that I don't know about? If it's based on a custom script using getComputedStyle, then I really don't need it (my own solution is actually fairly good).

This is a function of demos that are set up to look good in the editor versus real world usage. The intention of the content_css configuration is to provide the CSS that will be used to render the content.
If you apply the content CSS elements to the page then "Show HTML" works perfectly.
https://jsfiddle.net/xzh8utbp/
Alternatively, delete the content_css configuration (but that won't quite work in your example because JSFiddle adds CSS to the result window).
Note that I've added mce-content-body to the view div because it turns out our codepen demo CSS leverages it. Normally that wouldn't be required, but then I don't think normal integrations use our codepen CSS.

Block specific javascript statements from executing "globally" from a client side perspective

Dear Stackoverflow Community,
as you might see, this is my first post and a rather specific question I believe.
Here is the problem:
It is possible to block JavaScript as a whole or to block specific scripts, as far as I could find out. But what if I want to globally stop the execution of specific JavaScript statements?
Practical example:
A website uses several scripts, many of which are useful, and I would like to benefit from their functionality while permanently excluding any code that references overflow:hidden. I perceive that (CSS snippet?) to be malicious by design. It can easily be circumvented and fixed by executing your own code. That's not what I'm talking about though.
Probable solutions:
- Remove the browser's capability of understanding that particular code
- Enforce overflow:auto
- Apply overflow:auto when the website is fully loaded automatically
The aforementioned solutions seem very inelegant to me, and as you guys seem like a clever bunch, maybe you can think of something less superfluous and more practical.
All the go-to add-ons I've tried only offer one-off solutions or require repeating the task of counteracting those code snippets.
Current solution:
var r="html,body{overflow:auto !important;}";
var s=document.createElement("style");
s.type="text/css";
s.appendChild(document.createTextNode(r));
document.body.appendChild(s);
void 0;
Isn't there a way to tell Firefox (or Chrome) to categorically ignore every single attempt to alter overflow:auto, or similarly (perceived) malicious code?

If you really want to set a property for all the elements in the website you could try this jQuery code
$('*').attr('style', function (i, style) { return (style || '') + ";color:black !important"; });
Here I get all the elements in the page and add my custom property in their style. In my case it was a black color, but you can have your overflow set to auto.
Check this Fiddle and you'll see that I set many colors to the texts in the page, both with inline styles and through CSS and then the jQuery script forces all of them to be black.
The explanation is that an inline rule is more "powerful" than CSS rules, so if you set an inline rule to be important it is applied to the element instead of the CSS rule (as it overwrites it).
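The same idea can be written without jQuery and without string concatenation by using the standard setProperty method with its "important" priority argument. This is only a sketch: forceStyle is a made-up name, and it leaves the rest of each element's inline style untouched.

```javascript
// Hypothetical helper: set one property with !important on every element
// in `elements` (an array or a NodeList).
function forceStyle(elements, property, value) {
  for (const el of Array.from(elements)) {
    // The third argument is the priority; "important" adds !important.
    el.style.setProperty(property, value, "important");
  }
}
```

For the overflow case in the question, the call would be forceStyle(document.querySelectorAll("html, body"), "overflow", "auto"). Note that a script which later rewrites the same inline property would undo this, so it would have to be applied again afterwards.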

Replace or even better remove inline styles with Javascript or jQuery?

Before I posted this question I searched Google (and Stackoverflow) and though there are quite some results for this, I simply don't understand most offered solutions.
The problem I am experiencing is that I use a script which fetches RSS feeds from our main website. This works perfectly; however, it also brings along the inline styles that are sometimes used. Of course this messes up the way things look and looks rather, let's say, unprofessional.
I checked the source of what's being loaded and as far as I can tell, the main culprit is an inline style called:
<span style="font-family: verdana,geneva;">text</span>
Less frequent are the following ones (but I would still rather see them go as well):
<em>text</em>
<strong>text</strong>
<em class="moz-txt-slash">text</em>
<span class="moz-txt-tag">text</span>
Can these all be removed with jQuery or JavaScript? Apparently it's possible, but I don't know how. And should I put everything in a separate div container?
I can live with the unnecessary 'p's and 'br's, but would rather see the other ones removed.
Anyone out there who is willing to help me with this? My gratitude!
//edit
Thank you all for the quick responses... Highly appreciated.
I use a script called MagicParser to fetch those RSS feeds. I don't know much about coding like PHP, jQuery and Javascript, but I will try to use the solutions. I hope it will work. The first one didn't though :/

You can easily target all elements that have inline styles with $("[style]") and remove the styles with .removeAttr("style"):
$("[style]").removeAttr("style");
If you have a DOM node or jQuery collection and want to remove styles from its descendants, simply use .find("[style]").removeAttr("style") on it instead.
Classes are not the same as inline styles, but you can also remove those with .removeClass().
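If jQuery is not available on the page that shows the feed, roughly the same thing can be done with plain DOM calls. This is a sketch; stripInlineStyles is a made-up name, and it assumes the feed content sits inside one container element.

```javascript
// Hypothetical helper: remove the style attribute from every descendant of
// `root` that has one, and report how many elements were changed.
function stripInlineStyles(root) {
  const styled = Array.from(root.querySelectorAll("[style]"));
  for (const el of styled) {
    el.removeAttribute("style");
  }
  return styled.length;
}
```

It would be called with the container that holds the feed, for example stripInlineStyles(document.getElementById("feed")), where "feed" is a placeholder id. The elements themselves (<span>, <em>, <strong>) stay in place; only their inline styles go.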

You can use jquery:
$("#myID").attr("style","[Nothing here, or eventually styles to override]");
More info there:
http://api.jquery.com/attr/

Generating/selecting non-standard HTML tags with jQuery, a good idea?

I've noticed that jQuery can create, and access non-existent/non-standard HTML tags. For example,
$('<fake></fake>').appendTo('body').html('blah');
var foo = $('fake').html(); // foo === 'blah'
Will this break in some kind of validation? Is it a bad idea, or are there times this is useful? The main question is, although it can be done, should it be done?
Thanks in advance!

You can use non-standard HTML tags and most of the browsers should work fine, that's why you can use HTML5 tags in browsers that don't recognize them and all you need to do is tell them how to style them (particularly which tags are display: block). But I wouldn't recommend doing it for two reasons: first it breaks validation, and second you may use some tag that will later get added to HTML and suddenly your page stops working in newer browsers.

The biggest issue I see with this is that if you create a tag that's useful to you, who's to say it won't someday become standard? If that happens it may end up playing a role or get styles that you don't anticipate, breaking your code.

The rules of HTML do say that if manipulated through script the result should be valid both before and after the manipulation.
Validation is a means to an end, so if it works for you in some way, then I wouldn't worry too much about it. That said, I wouldn't do it to "sneak" past validation while using something like Facebook's <fb:fan /> element. I'd just suck it up and admit the code wasn't valid.

HTML as such allows you to use any markup you like. Browsers may react differently to unknown tags (and don't they to known ones, too?), but the general bottom line is that they ignore unknown tags and try to render their contents instead.
So technically, nothing is stopping you from using <fake> elements (compare what IE7 would do with an HTML5 page and the new tags defined there). HTML standardization has always been an after-the-fact process. Browser vendors invented tags and at some point the line was drawn and it was called HTMLx.
The real question is whether you positively must do it, and whether you care if the W3C validator likes your document or not, or if your fellow programmers like your document or not.
If you can do the same and stay within the standard, it's not worth the hassle.

There's really no reason to do something like this. The better way is to use classes like
<p class="my_class">
And then do something like
$('p.my_class').html('bah');
Edit:
The main reason that it's bad to use fake tags is because it makes your HTML invalid and could screw up the rendering of your page on certain browsers since they don't know how to treat the tag you've created (though most would treat it as some kind of DIV).
That's the main reason this isn't good, it just breaks standards and leads to confusing code that is difficult to maintain because you have to explain what your custom tags are for.
If you were really determined to use custom tags, you could make your web page a valid XML file and then use XSLT to transform the XML into valid HTML. But in this case, I'd just stick with classes.
