Using special symbols in HTML and Javascript - javascript

I'm trying to figure out the proper way to use the "not" operator (negation), which looks like ¬ (Unicode: U+00AC), in JavaScript. I could use:
<span>¬A</span>
But I'm not sure whether this is the proper way, or whether it is supported in all (or most) modern desktop and mobile browsers. I tried to find a previous topic on this matter but could not find any. The desired result is to display this operator beside A.

The correct escape sequence for HTML is &#x00AC; (so the markup is &#x00AC;A), and for JavaScript either '\xACA' or '\u00ACA'.
"meta charset" is always advisable to include, but it does not affect the escapes.
Note the ';' after the HTML escape sequence. It seemed not to be needed when the next character was, for example, 'k' instead of 'A', presumably because 'A' is a valid hexadecimal digit and would otherwise be read as part of the number. The semicolon has to be added.
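As a quick check of the JavaScript escapes mentioned above, here is a minimal sketch (the variable names are only illustrative). It shows that both escape forms produce the same two-character string as the literal ¬A, assuming the source file is saved and served as UTF-8:

```javascript
// Three ways to write the same string: NOT SIGN (U+00AC) followed by 'A'.
const viaHex = '\xACA';       // \xAC is a two-digit hex escape, then a plain 'A'
const viaUnicode = '\u00ACA'; // \u00AC is a four-digit Unicode escape, then 'A'
const literal = '¬A';         // only safe if the file really is UTF-8

console.log(viaHex === viaUnicode); // true
console.log(viaHex === literal);    // true

// In a browser, the string can be placed into the page without any HTML escaping:
// document.querySelector('span').textContent = viaUnicode;
```

Unlike the HTML numeric reference, the JavaScript escapes have a fixed number of digits, so the following 'A' is never mistaken for part of the code.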

If you are properly using UTF-8 (which is to say that your HTML, JS, and CSS files are all encoded in UTF-8, <meta charset="utf-8"> is present in the <head>, and HTTP headers all define the charset as UTF-8), there should be no need to encode the symbol at all. Simply write ¬, exactly as you have done here.
The only symbols you need to encode are < and & (also ' and/or " in attributes). Most would also recommend encoding > and the non-breaking space, to avoid confusion. I encode all of these, but nothing else.

Related

Javascript implementation of anti-XSS escaping functions

The OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet lists rules to prevent XSS attacks by escaping data appropriately, and it contains links to reference implementations of these escaping methods in the Java language (HTML Escape, Attribute Escape, JavaScript Escape, CSS Escape, URL Escape).
Is there an implementation anywhere of these in Javascript, or do I have to 'roll my own'?
UPDATE: I mean Javascript running in the browser. For example, for escaping text rendered with the jQuery html() method (though of course text() is safer), or escaping data rendered using a template engine such as EJS.
UPDATE2: ESAPI JavaScript seems to be what I was looking for, though it's still only "Alpha Quality"
Since you tend to work with the DOM in (client-side) JavaScript, there is no need for HTML and HTML attribute escaping. For example, given untrusted input input,
var el = document.createElement('div');
el.setAttribute('title', input);
el.appendChild(document.createTextNode(input));
is perfectly safe, since you are never constructing (serialized) HTML in the first place.
If you are writing custom JavaScript or CSS from JavaScript code, you are doing something wrong (including using document.write or some data URI script src abominations), so there is no escaping provided for either. You can simply write your code or styles beforehand and then call the appropriate functions or set the appropriate classes.
encodeURI and encodeURIComponent can be used to encode URIs or their components.
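To illustrate the difference between the two functions, here is a small sketch. The input value and the example.com URL are made up for the demonstration:

```javascript
// Hypothetical untrusted value that contains URL-significant characters.
const input = 'a&b=c d/é';

// encodeURIComponent escapes everything that could change the URL structure,
// so it is the one to use for a single query value or path segment.
const query = 'q=' + encodeURIComponent(input);
console.log(query); // q=a%26b%3Dc%20d%2F%C3%A9

// encodeURI is meant for a whole URI: it leaves separators such as
// ':', '/', '?', '&' and '=' intact and only escapes the rest.
const whole = encodeURI('https://example.com/a b?x=1&y=é');
console.log(whole); // https://example.com/a%20b?x=1&y=%C3%A9
```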
You can use the js-xss library. It worked against the test cases I've been using for injecting scripts into HTML.

angular fail using letters ø, æ, å, maybe something about i18n

I'm having a problem with my Angular app.
I can't make it use the letters "æ, ø, å".
I tried installing i18n, but it doesn't seem to do anything.
When I navigate to a page URL that contains one of the letters, it is converted to something like this:
Ops%EF%BF%BDtning
when it should be something like
Opsætning
The same thing happens when I use the letters on the pages, like shown here.
I have been using Angular for a long time, but this problem has never happened to me before. Can anyone share some wisdom? I'm sure it's some stupid little mistake I just can't see myself.
Anyway, thanks for your time.
Make sure to set the right charset, in your case utf-8:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
You might have to convert your files to UTF-8 (Notepad++ has a 'Convert to UTF-8' function).
Alternatively, you could use HTML character entities: æ (&aelig;), ø (&oslash;), å (&aring;).
You can read more about the use of special characters in HTML, XML and JS here
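The sequence %EF%BF%BD in the URL from the question is the UTF-8 encoding of U+FFFD, the Unicode replacement character, which usually appears when bytes in another encoding have been decoded as UTF-8. A small sketch, independent of Angular, that compares it with what a correctly encoded URL segment looks like:

```javascript
// With UTF-8 throughout, 'æ' (U+00E6) is percent-encoded as two bytes.
const encoded = encodeURIComponent('Opsætning');
console.log(encoded); // Ops%C3%A6tning

// The URL from the question decodes to a replacement character instead,
// meaning the original letter was already lost before the URL was built.
const broken = decodeURIComponent('Ops%EF%BF%BDtning');
console.log(broken.charCodeAt(3) === 0xFFFD); // true
```

If the second pattern shows up, the likely place to look is the encoding of the source files or the declared charset, not the routing code.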

Output script tags without jQuery, avoiding execution

I have JS calling remote server through AJAX. The response contains something similar to this
<script>alert(document.getElementById('some_generated_id').innerHTML); ... </script>
The user copies the response and uses it for their own purposes. Now I need to make sure that no browser runs the code when I do this:
var response = '<scrip.....';
document.getElementById('output_box').innerHTML = response;
Same should apply to any HTML tags. I know that .text() from jQuery will do exactly what I need:
var response = '<scrip.....';
$('#output_box').text(response);
I am looking for any solutions, including, but not limited to: escaping special characters, however displaying them correctly; adding zero-width space to tags (has to be efficient); outputting in parts. Has to be pure JS.
If you're using a server-side language there is probably a method to escape special characters.
In PHP you could use htmlspecialchars(), which converts certain characters that have significance in HTML to HTML entities (e.g. & to &amp;).
They will still display correctly and you'll be able to copy and paste the text, but the javascript shouldn't run.
If you need a pure javascript solution for this, someone has answered that here https://stackoverflow.com/a/4835406/15000
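For reference, a pure JavaScript approach along these lines might look like the sketch below. The escapeHtml helper is a hypothetical name, not taken from the linked answer, and 'output_box' is the element id from the question:

```javascript
// Hypothetical helper: replaces the characters that are significant in HTML
// with their entities, so the result is displayed as text and never parsed as tags.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const response = '<script>alert(1)</script>';
const safe = escapeHtml(response);
console.log(safe); // &lt;script&gt;alert(1)&lt;/script&gt;

// In a browser, either of these displays the markup as plain text:
// document.getElementById('output_box').innerHTML = safe;
// document.getElementById('output_box').textContent = response;
```

The textContent variant is the closest equivalent to jQuery's .text() and avoids string escaping altogether.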

How to replace href url without having all &amp; replaced by a &

I'm trying to replace a few empty href="" statements with computed URLs on page load.
The replacement string, however, contains several &amp;'s, which need to be in the href.
Whatever I try, as soon as setAttribute("href", url); or $(id).attr('href', url) is used, all &amp;'s are replaced by a & and the links fail.
How can I prevent this from happening? I'm pretty sure any code that ends in .href or contains href is going through a translation stage and will all fail.
You might have noticed that I've edited this message a lot of times because you simply cannot type a full '& a m p ;' in this box without it being translated to a '&' here too.
Let's step back a bit, and think about this from the ground up. The escape sequence &xxx; is the way to embed "special characters" into an HTML document. To be clear, it is an HTML escape sequence.
This way of escaping characters is useful because it means characters can be placed into an ASCII file that would otherwise be impossible to embed.
For example, the Euro sign (€) can be embedded using a numeric code representing its Unicode codepoint, &#8364;. Or, a non-breaking space can be embedded using its character entity name, &nbsp;. (Aside: these characters would be easier to embed in a UTF-8 file, with no escaping needed, but that's beside the point here.)
Now, because this escape sequence always begins with an ampersand, it means that a bare ampersand in an HTML document is ambiguous - as the browser reads through the document it doesn't know if you want an actual ampersand or whether there's an HTML entity escape sequence coming up. To remove this ambiguity, if you want to embed an ampersand in an HTML document then you need to escape it.
Let me highlight that sentence: To remove this ambiguity, if you want to embed an ampersand in an HTML document then you need to escape it.
Because this escaping only applies to HTML documents, if you are writing JavaScript in a .js file then this escape sequence is irrelevant to you. You can just drop your ampersands in unescaped and there's no problem. What it means is, your JavaScript files can replace the URLs with bare ampersands, no &amp; needed.
As far as the browser is concerned, all of those URLs you write only contain bare ampersands. The escape sequence &amp; is only needed to express those in HTML form. As soon as the browser has read the HTML, that escape sequence is forgotten about.
I've repeated myself a few times here but that's because I've tried to make it as clear as possible.
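A minimal sketch of computing such a URL in JavaScript with bare ampersands. The buildUrl helper, the example.com address, and the 'some_link' id are all assumptions made for illustration:

```javascript
// Hypothetical helper: builds a URL whose query string uses plain '&' separators.
function buildUrl(base, params) {
  const url = new URL(base);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.href;
}

const href = buildUrl('https://www.example.com/', { a: '1', b: '2' });
console.log(href); // https://www.example.com/?a=1&b=2

// In a browser, assign the string directly; no &amp; is involved at this level:
// document.getElementById('some_link').setAttribute('href', href);
```

If the page is later serialized back to HTML (for example in the element inspector), the browser writes the ampersand as an entity again, but the attribute value itself still holds a bare '&'.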
You have to replace &amp; with & in the URL prior to setting the href property of an a tag.
var url = 'www.example.com?a=1&amp;b=2';
url = url.replace(/&amp;/g, '&');
$('a').attr('href', url);

Change encoding from UTF-8 to ISO-8859-2 in Javascript

I would like to change string encoding from UTF-8 to ISO-8859-2 in Javascript. How can I do it?
I need it because I've designed a widget. The user just copies a <script> tag from my site and puts it on theirs. This script creates a div and fills it with the widget contents, including text.
If the target website uses UTF-8 encoding, it works fine. But when it uses ISO-8859-2, the text that is encoded in UTF-8 is displayed as ISO-8859-2 and as a result I see garbage.
Instead of using e.g. "ĉ" in your JavaScript code, use Unicode escapes such as "\u0109".
If you're in control of the output, you can replace all special characters with unicode escapes (e.g. \u00e4 for ä). The browser can interpret it correctly regardless of document encoding.
The easiest way to do this would be to put the string through a JSON encoder. Both PHP's and Ruby's do that. I don't know about other implementations, though.
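If the escaping has to be done in JavaScript itself, note that JSON.stringify leaves non-ASCII characters as they are, so a small helper is needed. The following is only a sketch; toUnicodeEscapes is a hypothetical name:

```javascript
// Hypothetical helper: rewrites every non-ASCII UTF-16 code unit as a \uXXXX
// escape, so the resulting source text is pure ASCII and no longer depends on
// the encoding of the page that loads the script.
function toUnicodeEscapes(text) {
  return text.replace(/[\u0080-\uffff]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

const escaped = toUnicodeEscapes('ä ĉ');
console.log(escaped); // \u00e4 \u0109

// The escaped form evaluates back to the original text when parsed as a
// string literal, which is what the browser does with the widget source.
console.log(JSON.parse('"' + escaped + '"') === 'ä ĉ'); // true
```

The output would be pasted into the widget's string literals at build time, so the served script contains no bytes outside ASCII.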
Another solution that might work is to add charset="utf-8" to the <script> tag.
I suppose you just need to convert your widget from UTF-8 to ISO-8859-2 and provide two versions of the script.
