unescape in javascript not working when %26 ( & sign) is in value - javascript

I have the below code in my JSP. UI displays every character correctly other than "&".
<c:out value="<script>var escapedData=unescape('${column}');
$('div').html(escapedData);</script>" escapeXml="false" /> </div>
E.g. 1) working case
input = ni!er#
Value in my escapedData variable is ni%21er%40. Now when I put it in my div using
$('div').html(escapedData); then o/p on html is as expected
E.g. 2) Issue case
input = nice&
Value in my escapedData variable is nice%26. Now when I put it in my div using
$('div').html(escapedData); then also it displays below
$('#test20').html('nice%26');
However, when output is displayed in JSP, it just prints "nice". It truncates everything after &.
Any suggestions?

It looks like you have some misunderstandings about what unescape(val)/escape(val) do and where you need them, and about what you need to pay attention to when you use .html().
HTML and URI have certain characters that have special meanings. The most important ones are:
HTML: <, >, &
URI: /,?,%,&
If you want to use one of those characters in HTML or URI you need to escape them.
The escaping for URI and for HTML are different.
The functions unescape/escape (deprecated) and decodeURI/encodeURI are for URIs. But what you want is to escape your data into the HTML format.
There is no built-in function in JS that does this, but you could e.g. use the code from the answer to the question "Can I escape html special chars in javascript?".
But as it seems that you use jQuery, you could just use .text() instead of .html(), as this will do the escaping for you.
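A minimal sketch of such an HTML-escaping helper, for cases where .text() is not an option. The name escapeHtml and the exact set of replaced characters are assumptions for illustration, not something from the question:

```javascript
// Hypothetical helper: replaces the characters that are special in HTML
// with their entity form so the value is shown literally, not parsed.
function escapeHtml(text) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

const escapedValue = escapeHtml('nice& <b>bold</b>');
console.log(escapedValue);
```

With jQuery this would be used as $('div').html(escapeHtml(value)), which should display the same thing as $('div').text(value).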
An additional note:
I'm pretty sure that var escapedData=unescape('${column}'); does not do anything. I assume that ${column} already is ni!er#/nice&.
So please check the source of the rendered page. If var escapedData=unescape('${column}'); is rendered as var escapedData=unescape('ni!er#'); then you should remove the unescape, otherwise you will not get the expected result if ${column} contains something like %23.
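To illustrate why the extra unescape can hurt, here is a small sketch. The sample strings are made up for illustration; only unescape itself comes from the question:

```javascript
// Case 1: the value really is percent-encoded, so decoding once is correct.
const encodedValue = 'nice%26';
const decodedOnce = unescape(encodedValue);

// Case 2: the value is already plain text that happens to contain "%23".
// Running unescape on it silently changes what the user typed.
const plainValue = 'see %23 in the docs';
const corrupted = unescape(plainValue);

console.log(decodedOnce);
console.log(corrupted);
```

The non-deprecated counterpart, decodeURIComponent, behaves the same way for these inputs, so the advice to decode only data that is actually encoded applies to it as well.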

Related

Escaping string using JavaScript before populating form input

I have this code where I grab an attribute value and load it into a form, the headline line can look something like:
Welcome to America's best valued whatever
But when using this escape function, the string is cut off at the apostrophe,
var headline = escape($(this).attr("data-headline"));
//populate the textbox
$(e.currentTarget).find('input[name="headline"]').val(headline);
I've also tried using the solutions here: HtmlSpecialChars equivalent in Javascript? with no luck.
How can I populate my input and keep apostrophes/quotes?
Just use
$(this).find('input[name="headline"]').val(this.dataset.headline);
No need for any escaping.
However, notice that escape does not cut off apostrophes, it replaces them with %27. If your current code does not work with apostrophes in the headline, make sure that the markup containing the data-headline attribute is properly escaped by whatever tool is creating it.
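A quick check of that claim, using the headline from the question (the variable names are only for illustration):

```javascript
// escape() percent-encodes the apostrophe and the spaces; nothing is cut off.
const headlineText = "Welcome to America's best valued whatever";
const escapedHeadline = escape(headlineText);

// unescape() restores the original string exactly.
const roundTripped = unescape(escapedHeadline);

console.log(escapedHeadline);
console.log(roundTripped === headlineText);
```

So if the text is truncated at the apostrophe, the truncation most likely happens earlier, in the markup that carries the data-headline attribute.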
var headline = $(this).attr("data-headline").replace(/'/g, '%27');
//populate the textbox
$(e.currentTarget).find('input[name="headline"]').val(unescape(headline));
If browser compatibility is important, note that dataset is only available in IE11+: https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/dataset#Browser_compatibility

JavaScript:output symbols and special characters

I am trying to include some symbols into a div using JavaScript.
It should look like this:
x ∈ ℝ
, but all I get is: x &isin; &reals;.
var div=document.getElementById("text");
var textnode = document.createTextNode("x &isin; &reals;");
div.appendChild(textnode);
<div id="text"></div>
I had tried document.getElementById("something").innerHTML="x &isin; &reals;" and it worked, so I have no clue why the createTextNode method did not.
What should I do in order to output the right thing?
You are including HTML escapes ("entities") in what needs to be text. According to the docs for createTextNode:
data is a string containing the data to be put in the text node
That's it. It's the data to be put in the text node. The DOM spec is just as clear:
Creates a Text node given the specified string.
You want to include Unicode in this string. To include Unicode in a JavaScript string, use Unicode escapes, in the format \uXXXX.
var textnode = document.createTextNode("x \u2208 \u211D");
Or, you could simply include the actual Unicode character and avoid all the trouble:
var textnode = document.createTextNode("x ∈ ℝ");
In this case, just make sure that the JS file is served as UTF-8, you are saving the file as UTF-8, etc.
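As a small sketch of the point above, the escaped and the literal spellings produce the same JavaScript string, and neither involves HTML entities. The variable names are illustrative:

```javascript
// Three ways to write the same five-character string in JavaScript.
const withEscapes = 'x \u2208 \u211D';
const withLiterals = 'x ∈ ℝ';
const fromCodePoints =
  'x ' + String.fromCodePoint(0x2208) + ' ' + String.fromCodePoint(0x211D);

const allEqual = withEscapes === withLiterals && withLiterals === fromCodePoints;
console.log(allEqual, withEscapes.length);

// In a browser, any of them can be passed straight to
// document.createTextNode(...) and is shown as-is.
```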
The reason that setting .innerHTML works with HTML entities is that it sets the content as HTML, meaning it interprets it as HTML, in all regards, including markup, special entities, etc. It may be easier to understand this if you consider the difference between the following:
document.createTextNode("<div>foo</div>");
document.createElement("div").textContent = "<div>foo</div>";
document.createElement("div").innerHTML = "<div>foo</div>";
The first creates a text node with the literal characters "<div>foo</div>". The second sets the content of the new element literally to "<div>foo</div>". The third, on the other hand, creates an actual div element inside the new element containing the text "foo".
Every character has a hexadecimal code point (for example 211D). If you want to write it as an HTML entity, prefix it with &#x and end it with a semicolon => &#x211D;, or use the entity name &reals; or the decimal form &#8477;. These can all be found here: http://www.w3schools.com/charsets/ref_html_entities_4.asp
But when you use JavaScript, in order to make the browser understand that you want to output a Unicode symbol and not the literal string, an escape sequence is required. To write one, add \u before the hexadecimal code => \u211D.
document.createTextNode will automatically html-escape the needed characters. You have to provide those texts as JavaScript strings, either escaped or not:
document.body.appendChild(document.createTextNode("x ∈ ℝ"));
document.body.appendChild(document.createElement("br"));
document.body.appendChild(document.createTextNode("x \u2208 \u211d"));
EDIT: It's not true that the createTextNode function will do actual HTML escaping here, as it doesn't need to. @deceze gave a very good explanation about the connection between the DOM and HTML: HTML is a textual representation of the DOM, thus you don't need any HTML-related escaping when directly manipulating the DOM.

How to use Javascript to add HTML code without it converting escaped characters?

I am using AJAX to handle a form submission. The AJAX request returns a javascript script with text string arguments. I run into a problem when I try to add the AJAX returned script to the existing page.
Here are the different things I've tried to accomplish this already:
newAjaxBlock.appendChild(document.createTextNode(ajaxRequest.responseText));
newAjaxBlock.innerHTML = ajaxRequest.responseText;
newAjaxBlock.textContent = ajaxRequest.responseText;
The problem is that if I use .innerHTML to insert the returned script, it converts the escaped characters in the argument text string to their HTML equivalent and the script will throw errors because of single quotes and other characters in the string.
I expected .innerHTML to take the text and write it exactly as PHP provides it without unexpected conversions from escaped characters to their HTML equivalents.
For example I would generate a script in PHP and run it through htmlspecialchars() and make a text string exactly as follows:
<script type='text/javascript' id='layerScript'>Lib.alertFunction(arg1, $arg2, '&lt;p&gt;You changed THING from &quot;value1&quot; to &quot;newValue&quot;.&lt;/p&gt;');</script>
But instead .innerHTML converts it to this:
<script type='text/javascript' id='layerScript'>Lib.alertFunction(arg1, $arg2, '<p>You changed THING from "value1" to "newValue".</p>');</script>
and as you can see, the script won't work with single quotes and other characters messing up the argument list.
In contrast, when I tried using the createTextNode or .textContent options it creates a text node that ignores the HTML tags and shows it ALL as text instead of interpreting the HTML. This is not a surprise to me but leaves me with no option that actually just puts the HTML code in as it's written without converting the escaped characters.
All of the code works exactly as I expect and need it to except when the script argument contains single quotes or lt and gt symbols so I know I have narrowed the problem down to this single issue. I don't want jquery suggestions and I know I could code for an extra few days to make a function that does what I need but I want to know if there's something that does what .innerHTML does without converting escaped characters before I waste that time.
This exact question was already asked and was answered with "use .textContent" which as I mentioned doesn't work to insert formatted HTML with AJAX.

jQuery URI encode (char &) .html() attribute value

I've read a lot of the HTML encoding posts over the last day to solve this. I just managed to locate the problem.
Basically I have set an attribute on an embed tag with jQuery. It all works fine in the browser.
Now I want to read the HTML itself to add the result as a value for an input field to let the user copy & paste it.
The PROBLEM is that the .html() function (also plain JS .innerHTML) converts the '&' char into '& amp;' (without the space). Using different HTML encoder functions doesn't make a difference. I need the '&' char in the embed code.
Here is the code:
HTML:
<div id="preview_small">
<object><embed src="main.swf?XY=xyz&YXX=xyzz"></embed>
</object></div>
jQuery:
$("#preview_small object").clone().html();
returns
... src=main.swf?XY=xyz&amp;YXX=xyzz ...
When I use:
$("#preview_small object").clone().children("embed").attr("src");
returns
main.swf?XY=xyz&YXX=xyzz
Any ideas how I can get the '&' char direct, without using regex after I got the string with .html()
I need the & char in the embed code.
No you don't. This:
<embed src="xyz&YXX=xyz"></embed>
is invalid HTML. It'll work in browsers since they try to fix up mistakes like this, but only as long as the string YXX doesn't happen to match an HTML entity name. You don't want to rely on that.
This:
<embed src="xyz&amp;YXX=xyz"></embed>
is correct, works everywhere, and is the version you should be telling your users to copy and paste.
attr("src") returns xyz&YXX=xyz
Yes, that's the underlying value of that attribute. Attribute values and text content can contain almost any character directly. It's only the HTML serialisation of them where they have to be encoded:
<div title="a&lt;b&quot;&amp;c&gt;d">
$('div').attr('title') -> a<b"&c>d
I want to read the HTML itself to add the result as a value for an input field
<textarea id="foo"></textarea>
$('#foo').val($('#preview_small object').html());
However note that the serialised output of innerHTML/html() is not in any particular fixed dialect of HTML, and in particular IE may give you code that, though generally understandable by browsers, is also not technically valid:
$('#somediv').html('<div title="a/b"></div>');
$('#somediv').html() -> '<DIV title=a/b></DIV>' - missing quotes
So if you know the particular format of HTML you want to present to the user, you may be better off generating it yourself:
function encodeHTML(s) {
    return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/"/g, '&quot;');
}
var src= 'XY=xyz&YXX=xyzz';
$('#foo').val('<embed src="'+encodeHTML(src)+'"><\/embed>');
(The \/ in the close tag is just so that doesn't get mistaken as the end of a <script> block, in case you're in one.)

IE innerHTML chops sentence if the last word contains '&' (ampersand)

I am trying to populate a DOM element with ID 'myElement'. The content which I'm populating is a mix of text and HTML elements.
Assume following is the content I wish to populate in my DOM element.
var x = "<b>Success</b> is a matter of hard work &luck";
I tried using innerHTML as follows,
document.getElementById("myElement").innerHTML=x;
This resulted in chopping off of the last word in my sentence.
Apparently, the problem is due to the '&' character present in the last word. I played around with the '&' and innerHTML and following are my observations.
If the last word of the content is less than 10 characters and if it has a '&' character present in it, innerHTML chops off the sentence at '&'.
This problem does not happen in firefox.
If I use innerText the last word is intact, but then all the HTML tags which are part of the content become plain text.
I tried populating through jQuery's #html method,
$("#myElement").html(x);
This approach solves the problem in IE but not in chrome.
How can I insert a HTML content with a last word containing '&' without it being chopped off in all browsers?
Update : 1. I tried html encoding the content which I am trying to insert into the DOM. When I encode the content, the html tags which are part of the content becomes plain string.
For the above mentioned content, I expect the result to be rendered as,
Success is a matter of hard work &luck
but when I encode what I actually get in the rendered page is,
<b>Success</b> is a matter of hard work &luck
You should replace your & with &amp;.
The & (ampersand) character is used within HTML to represent various special characters. For example, &quot; = ", &lt; = <, etcetera. Now, &luck clearly is not a valid HTML entity (for one, it is missing the semicolon). However, various browsers may, due to a combination of error correction (the semicolon) and the fact that it looks somewhat like an HTML entity (& followed by four characters), try to parse it as such.
Because &luck; is not a valid HTML entity, the original text is lost. Because of this, when using an ampersand in your HTML, always use &amp;.
Update: When this text is entered by a user, it is up to you to escape this character properly. In PHP for example, you would call htmlentities on the text before displaying it to the user. This has the added benefit of filtering out malicious user code such as <script> tags.
The ampersand is a special character in HTML that indicates the start of a character entity reference or numeric character reference, so you need to escape it. Instead of:
var x = "<b>Success</b> is a matter of hard work &luck";
Try using this:
var x = "<b>Success</b> is a matter of hard work &amp;luck";
By HTML encoding the ampersand, you are ensuring that there is no ambiguity in what you mean when you write "&luck".
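If the content mixes intended markup with stray ampersands, one possible approach is to escape only the ampersands that do not already start something shaped like an entity. This is a rough sketch under that assumption; the function name escapeBareAmpersands is hypothetical, and the pattern is a heuristic rather than a full HTML parser (for example, text like "R&D;" would be left alone because it looks like an entity):

```javascript
// Hypothetical helper: turns "&" into "&amp;" unless it is followed by
// a named (&name;), decimal (&#38;) or hexadecimal (&#x26;) reference.
function escapeBareAmpersands(html) {
  const bareAmpersand = /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g;
  return html.replace(bareAmpersand, '&amp;');
}

const fixedContent = escapeBareAmpersands(
  '<b>Success</b> is a matter of hard work &luck'
);
console.log(fixedContent);
```

The tags stay untouched, so the result can still be assigned through innerHTML, while the ampersand is no longer ambiguous.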
