How should I properly encode a string containing HTML (in JavaScript) - javascript

My code has an XSS vulnerability and I want to prevent it. I am using a jQuery encoder plugin to prevent XSS.
The JS code is as follows:
function test(response)
{
    $('#test').html($.encoder.encodeForHTML(response));
}
The contents of the response are:
response="<html><body><table><tr><td>DATA1</td><td>DATA2</td></tr></table></body></html>"
When I call $.encoder.encodeForHTML(), I expect the output to be a table (before using the encoder I was getting a properly formatted table), but now the response string itself is printed inside the div element.
Can someone please help me with this? I need to encode the response (which contains HTML code) in such a way that the output is still rendered as a proper table.
Am I going wrong somewhere, or am I using the wrong function to encode? Suggestions welcome.

Just use var encodedData = encodeURI(htmlDataToEncode); before you pass it as a response.
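For context on why the table stopped rendering: HTML-encoding replaces markup characters with entities, so the browser shows the tags as literal text instead of building elements. Note also that encodeURI percent-encodes characters for use in a URL; it is not an HTML encoder. A minimal sketch of the encoding effect, assuming a hypothetical escapeHtml helper that stands in for the plugin's encoder (it is not the plugin's real implementation):

```javascript
// Minimal HTML-escaping helper. This is a stand-in for what an HTML encoder
// does, NOT the jQuery Encoder plugin's actual implementation.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const response = '<table><tr><td>DATA1</td><td>DATA2</td></tr></table>';
const encoded = escapeHtml(response);

// Every "<" and ">" is now an entity, so a browser displays the markup as text
console.log(encoded);
```

Encoding is the right tool for untrusted text. For markup that must still render as a table, the usual approach is to sanitize it (allow only known-safe tags and attributes) rather than encode it.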

Related

What is the right way to safely and accurately insert user-provided URL data into an HTML5 document?

Given an arbitrary customer input in a web form for a URL, I want to generate a new HTML document containing that URL within an href. My question is how am I supposed to protect that URL within my HTML.
What should be rendered into the HTML for the following URLs that are entered by an unknown end user:
http://example.com/?file=some_19%affordable.txt
http://example.com/url?source=web&last="f o o"&bar=<
https://www.google.com/url?source=web&sqi=2&url=https%3A%2F%2Ftwitter.com%2F%3Flang%3Den&last=%22foo%22
If we assume that the URLs are already uri-encoded, which I think is reasonable if they are copying it from a URL bar, then simply passing it to attr() produces a valid URL and document that passes the Nu HTML checker at validator.w3.org/nu.
To see it in action, we set up a JS fiddle at https://jsfiddle.net/kamelkev/w8ygpcsz/2/ where replacing the URLs in there with the examples above can show what is happening.
For future reference, this consists of an HTML snippet
<a>My Link</a>
and this JS:
$(document).ready(function() {
    $('a').attr('href', 'http://example.com/request.html?data=>');
    $('a').attr('href2', 'http://example.com/request.html?data=<');
    alert($('a').get(0).outerHTML);
});
So with URL 1, it is not possible to tell if it is URI encoded or not by looking at it mechanically. You can surmise based on your human knowledge that it is not, and is referring to a file named some_19%affordable.txt. When run through the fiddle, it produces
My Link
Which passes the HTML5 validator no problem. It likely is not what the user intended though.
The second URL is clearly not URI encoded. The question becomes what is the right thing to put into the HTML to prevent HTML parsing problems.
Running it through the fiddle, Safari 10 produces this:
My Link
and pretty much every other browser produces this:
My Link
Neither of these passes the validator. Three complaints are possible: the literal double quote (from un-escaping HTML), the spaces, or the trailing < character (also from un-escaping HTML). It just shows you the first of these it finds. This is clearly not valid HTML.
There are two ways to try to fix this. The first is to HTML-escape the URL before giving it to attr(). However, attr() then escapes every & as &amp;, so entities such as &amp; and &lt; become double-escaped, and the URL in the document is entirely inaccurate. It looks like this:
My Link
The other is to URI-encode it before passing to attr(), which does result in a proper validating URL which actually clicks to the intended destination. It looks like this:
My Link
Finally, for the third URL, which is properly URI encoded, the proper HTML that validates does come out.
My Link
and it does what the user would expect to happen when clicked.
Based on this, the algorithm should be:
if url is encoded then
    pass as-is to attr()
else
    pass encodeURI(url) to attr()
however, the "is encoded" test seems to be impossible to detect in the affirmative based on these two prior discussions (indeed, see example URL 1):
How to find out if string has already been URL encoded?
How to know if a URL is decoded/encoded?
If we bypass the attr() method and forcibly insert the HTML-escaped version of example URL 2 into the document structure, it would look like this:
My Link
Which seemingly looks like valid HTML, yet fails the HTML5 validator because it unescapes to have invalid URL characters. The browsers, however, don't seem to mind it. Unfortunately, if you do any other manipulation of the object, the browser will re-escape all the &'s anyway.
As you can see, this is all very confusing. This is the first time we're using the browser itself to generate the HTML, and we are not sure if we are getting it right. Previously, we did it server side using templates, and only did the HTML-escape filter.
What is the right way to safely and accurately insert user-provided
URL data into an HTML5 document (using JavaScript)?
If you can assume the URL is either encoded or not encoded, you may be able to get away with something along the lines of this. Try to decode the URL, treat an error as the URL not being encoded and you should be left with a decoded URL.
<script>
var inputurl = 'http://example.com/?file=some_19%affordable.txt';
var myurl;
try {
    myurl = decodeURI(inputurl);
}
catch(error) {
    myurl = inputurl;
}
console.log(myurl);
</script>
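One caveat with decoding and then re-encoding: decodeURI leaves escapes for reserved characters (such as %3A) untouched, and encodeURI escapes every %, so a round trip through both can double-encode a URL like the third example. An alternative that is not from the answer above is to let the URL parser normalize the input and to accept only http(s) links. The sketch below assumes an environment with the standard URL class (modern browsers, Node.js); normalizeUserUrl is a hypothetical helper name:

```javascript
// Sketch: normalize a user-supplied URL with the standard URL parser and
// accept only http(s). normalizeUserUrl is a hypothetical helper name.
function normalizeUserUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws if the input is not an absolute URL
  } catch (error) {
    return null;
  }
  // Reject schemes such as javascript: that would run script when clicked
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  // href is the serialized URL: spaces, quotes and "<" in the query are
  // percent-encoded, while existing percent escapes are left alone
  return parsed.href;
}

console.log(normalizeUserUrl('http://example.com/url?source=web&last="f o o"&bar=<'));
console.log(normalizeUserUrl('javascript:alert(1)'));
```

The returned string can then be handed to attr('href', ...), which takes care of HTML-escaping the attribute value. A null result means the input should be rejected or shown to the user for correction.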

Unable to remove html tags from response JSON

Hi, I am new to AngularJS and am having a problem rendering JSON data properly. The JSON response itself contains HTML-formatted data (it includes HTML tags such as <BR>). In a desktop browser the response displays fine, but on devices (tablet, mobile) the HTML tags appear as literal text. I am using AngularJS to bind the JSON response to the DOM. Is there any way to simply ignore HTML tags in jQuery or AngularJS? At the same time, I don't want to remove the HTML tags, as they are needed to define line breaks, spacing, tables, etc.
A sample response I am getting is like:
A heavier weight, stretchy, wrinkle resistant fabric.<BR><BR>Fabric Content:<BR>100% Polyester<BR><BR>Wash Care:<BR>
If I apply the binding using {{pdp.desc}}, the HTML tags are displayed as text as well. Is there any way to accomplish this?
I have added ng-bind-html-unsafe="pdp.desc", but the "BR" tags are still showing up.
Unwanted HTML tags can be removed using a regular expression. Try this:
str.replace(/<\/?[^>]+>/gi, '')
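Stripping every tag also discards the line breaks the question wants to keep. If plain text with newlines is acceptable, a small sketch (htmlToPlainText is a hypothetical helper name) could convert <BR> tags to newline characters before applying the same tag-stripping pattern as above. Regex stripping is a display convenience here, not a security sanitizer:

```javascript
// Sketch: turn <BR> tags into newlines, then drop any remaining tags.
// htmlToPlainText is a hypothetical helper name.
function htmlToPlainText(html) {
  return String(html)
    .replace(/<br\s*\/?>/gi, '\n') // keep line breaks as newline characters
    .replace(/<\/?[^>]+>/g, '');   // same tag-stripping pattern as the answer above
}

const desc = 'Fabric Content:<BR>100% Polyester<BR><BR>Wash Care:<BR>';
console.log(htmlToPlainText(desc));
```

To show the newlines in a page, the element holding the text would need CSS such as white-space: pre-line.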
Try using three pairs of braces, {{{pdp.desc}}}. It works in Handlebars, and possibly in your case too.
Use a JS HTML parser. (The snippet below is C# using .NET's Regex class, not JavaScript.)
var pattern = @"<(img|a)[^>]*>(?<content>[^<]*)<";
var regex = new Regex(pattern);
var m = regex.Match(sSummary);
if ( m.Success ) {
    sResult = m.Groups["content"].Value;
}
Courtesy of Stack Overflow.

Convert string to JSON and then read the JSON value

First of all, I'm fairly new to JSON, so please forgive me if I've made a terrible mistake. I've got some code that gets a JSON object from a website using YQL. It returns it as a string. So now I want to parse this into a JSON object and then read it.
This is my code:
$.getJSON("http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url=%22http://iphone-api.uitzendinggemist.nl/v1/episodes.json%22%20and%20xpath=%27*%27&format=json", function(data) {
    console.log(data);
    content = data.query.results.html.body.p;
    json = JSON.stringify(eval("(" + content + ")"));
    str = json.revoked;
    $('#table').append('<li>' + str + '</li>');
});
JS fiddle
I just can't figure out why this gives me undefined instead of the value it should give.
So my question is whether someone here knows why it isn't working properly.
The json variable is an array, you need to access an index.
string = json[0].revoked;
You have many many many errors in your code. You should try to understand each step that you are doing, it looks like you don't. Here's a fork of your code that does something, I'm not sure what you want it to do. I'll tell you few things you did wrong:
Use var keyword when declaring new variables within functions
Don't parse JSON using eval(), but use some parser. E.g. $.parseJSON(). Using eval() is a security risk, as returned script WILL be executed on client and you should only be interested in getting data.
When constructing HTML, take care to encode text that you want displayed. In your case, don't concatenate strings ('<li>' + str + '</li>'). You can use jQuery ($('<li>').text(str)).
Don't add li elements to a table element. Either add them to ul or ol elements, or in case of tables create rows and cells.
It is completely unclear why you would eval and then stringify an object. You end up with the exact same data.
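The parsing advice above can be sketched as follows. readRevoked is a hypothetical helper name, the sample payload is made up for illustration, and the array handling follows the first answer's suggestion that the data is an array:

```javascript
// Sketch: parse the JSON text with JSON.parse instead of eval().
// readRevoked is a hypothetical helper; "revoked" is the field from the question.
function readRevoked(content) {
  const data = JSON.parse(content); // throws SyntaxError on malformed JSON
  // The first answer suggests the payload is an array, so handle both shapes
  const first = Array.isArray(data) ? data[0] : data;
  return first ? first.revoked : undefined;
}

// Made-up sample payload, for illustration only
const sample = '[{"revoked": "2012-05-01"}]';
console.log(readRevoked(sample));
```

In the page itself, the value could then be added with $('<li>').text(value).appendTo('#list'), where #list is a hypothetical ul element. Using .text() means the value is inserted as text rather than parsed as HTML.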

serialize javascript code inside html tags

I have an ajax function that responds with contents of an html tag, as a json string that I will eval to make an object out of it.
Now as long as the json string has no javascript inside it, it works fine. But when I include some simple javascript inside the string, I get Uncaught SyntaxError: Unexpected identifier when I try to eval the string, although I escape every ' and ".
Here is the sample: http://jsfiddle.net/dCtA3/2/
When I remove the javascript: http://jsfiddle.net/dCtA3/3/ it works.
Try it using proper JSON (if your XHR returns JSON, use that, parse it, or check if the browser didn't parse it already for you (should be [response].innerhtml)). Change your jsfiddle code to:
var contents = JSON.stringify(
    {innerhtml : "<p onclick=\"javascript:alert('hello world');\"> click me! </p>"}
);
function insertContent() {
    document.getElementById("container")
        .innerHTML = JSON.parse(contents).innerhtml;
}
[Edit based on comment]
see this jsfiddle where a json request is simulated on button click. The stringified response object is shown after clicking the button.
Do you need your javascript code to be generated and does it HAVE TO be in that string? If the javascript can be in your js file you can simply do this:
Make a function helloWorld that does the alert, and in the HTML inside your JSON string have onclick="helloWorld()".

strange characters (amp;) added to moss service output

I have a MOSS service which outputs the URL of an image.
If the output URL contains an '&' character, the service appends amp; after the &.
For example: Directory.aspx?&amp;z=BWxNK
Here amp; has been added. It is a MOSS service, so I don't have control over it.
What I can do is decode the output. As I am calling the MOSS service via Ajax, I have to decode the output in JavaScript. I tried decodeURIComponent, decodeURI, and unescape; none of them solved the problem.
Any help is greatly appreciated. Even a server-side function would be helpful. I am using ASP.NET MVC 3.
Regards,
Kumar.
&amp; is not URI encoded, it's HTML encoded.
For a server side solution, you could do this:
Server.HtmlDecode("&amp;") // yields "&"
For a JavaScript solution, you could set the html to "&amp;" and read out the text, to simulate HTML decoding. In jQuery, it could look like this:
$("<span/>").html("&amp;").text(); // yields "&"
&amp; is SGML/XML/HTML for &.
If the service is outputting an XML document, then make sure you are using an XML parser to parse it (and not regular expressions or something equally crazy).
Otherwise, you need to decode the (presumably) HTML. In JavaScript, the easiest way to do that is:
var foo = document.createElement('div');
foo.innerHTML = myString;
var url = foo.firstChild.data;
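Where no DOM is available, or where assigning untrusted strings to innerHTML is undesirable, a small lookup-based decoder may be enough for a URL like the one in the question. This is only a sketch: decodeBasicEntities is a hypothetical helper that covers five common entities, not the full HTML entity set:

```javascript
// Sketch: decode a handful of common HTML entities without touching the DOM.
// decodeBasicEntities is a hypothetical helper; it does NOT cover every entity.
function decodeBasicEntities(text) {
  const entities = { amp: '&', lt: '<', gt: '>', quot: '"', '#39': "'" };
  // A single pass, so "&amp;lt;" decodes to "&lt;" and is not decoded twice
  return String(text).replace(/&(amp|lt|gt|quot|#39);/g, (match, name) => entities[name]);
}

console.log(decodeBasicEntities('Directory.aspx?&amp;z=BWxNK'));
```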
