What is dangerous in an HTML textarea?

What is dangerous in an HTML textarea? - javascript

I'm working on the profile section of my users.
They can define several things including their description ("About me") in a textarea, with a max of 400 characters.
In this description, I want to let my users use Font Awesome and Bootstrap icons. I also let them use JS tags (but not PHP ones). I guess this is pretty dangerous, therefore I wanted to know :
Is letting people use JS tags dangerous ? I know I must block functions like $.ajax but maybe there are somethings else.
Does a function which blocks string containing JS or PHP code exist in JS or jQuery ?
Is letting people use HTML tags and attributes dangerous for my site ?
Thank you !

As long as you escape all the tags before saving the form, I think it's all good.
You can do this with the following function:
function escapeTags(value){
return $('<div/>').text(value).html();
}
For eg. the following <script>alert("hello world")</script> will become <script>alert("hello world")</script>.
Also, you can do this with javascript only:
function htmlEntities(str) {
return String(str).replace(/&/g, '&').replace(/</g, '<').replace(/>/g, '>').replace(/"/g, '"');
}
Source: https://css-tricks.com/snippets/javascript/htmlentities-for-javascript/
...if you take a look at comments you'll se that there's also a function that reverse my escapeTags function
// Encode/decode htmlentities
function krEncodeEntities(s){
return $j("<div/>").text(s).html();
}
function krDencodeEntities(s){
return $j("<div/>").html(s).text();
}

Yes, it's in fact very dangerous, and should only be allowed on a very limited set of "tags" or function calls.
Several systems like bbcode exist to address specifically these issues. i suggest implementing one of those.
They're easier to validate, and it is fairly easy to add new features to them.
It should be less work than validating actual js and php code and figuring out whether or not it is trying to do something malicious.

Related

Using MVC resource in Jquery - function stops working

I have a Jquery function in MVC View that check if at least one checkbox is clicked. Function is working properly if I use hardcoded string. But when I add
#Resources.myString into, it stops working, I can't figure out why
$('.form-horizontal').on('submit', function (e) {
if ($("input[type=checkbox]:checked").length === 0) {
e.preventDefault();
alert("This is working");
alert(#Resources.myString); //with this the function is not working anymore
return false;
}
});
I need to add the the string for multilingual purpose.
I tried diferent aproches
alert(#Resources.myString);
alert(#Html.Raw(Resources.myString))
var aaa = { #Html.Raw(Resources.myString)} //and calling the aaa
I think I am missing some basic knowlage of how this should work together

During page rendering, #Resources.myString will be injected as is in the code. For instance, if myString == "abc";, you'll end up with alert(abc); which is not what you want.
Just try to enclose your string in quotes:
alert("#Resources.myString");
As an aside, putting Razor code in Javascript logic is usually considered bad practice, as it prevents you from putting Javascript code in separate files (and therefore caching), and makes the code less readable.
Take a look as this question and the provided answer which gives a simple way to deal with that.

As ASP.NET dynamically generates HTML, CSS, JS code, the best way to find the error is to read the generated sources (Ctrl + U in most modern browsers).
You will see that your code
alert(#Resources.myString);
produces
alert(yourStringContent);
and should result in a console error yourStringContent is not defined.
You need to use quotes as you are working with a JavaScript string:
alert('#Resources.myString');
It will produce a correct JavaScript code like:
alert('yourStringContent');

Techniques to avoid building HTML strings in JavaScript?

It is very often I come across a situation in which I want to modify, or even insert whole blocks of HTML into a page using JavaScript. Usually it also involves changing several parts of the HTML dynamically depending on certain parameters.
However, it can make for messy/unreadable code, and it just doesn't seem right to have these little snippets of HTML in my JavaScript code, dammit.
So, what are some of your techniques to avoid mixing HTML and JavaScript?

The Dojo toolkit has a quite useful system to deal with HTML fragments/templates. Let's say the HTML snippet mycode/mysnippet.tpl.html is something like the following
<div>
<span dojoAttachPoint="foo"></span>
</div>
Notice the dojoAttachPoint attribute. You can then make a widget mycode/mysnippet.js using the HTML snippet as its template:
dojo.declare("mycode.mysnippet", [dijit._Widget, dijit._Templated], {
templateString: dojo.cache("mycode", "mysnippet.tpl.html"),
construct: function(bar){
this.bar = bar;
},
buildRendering: function() {
this.inherited(arguments);
this.foo.innerHTML = this.bar;
}
});
The HTML elements given attach point attributes will become class members in the widget code. You can then use the templated widget like so:
new mycode.mysnippet("A cup of tea would restore my normality.").placeAt(someOtherDomElement);
A nice feature is that if you use dojo.cache and Dojo's build system, it will insert the HTML template text into the javascript code, so that the client doesn't have to make a separate request.
This may of course be way too bloated for your use case, but I find it quite useful - and since you asked for techniques, there's mine. Sitepoint has a nice article on it too.

There are many possible techniques. Perhaps the most obvious is to have all elements on the page but have them hidden - then your JS can simply unhide them/show them as required. This may not be possible though for certain situations. What if you need to add a number (unspecified) of duplicate elements (or groups of elements)? Then perhaps have the elements in question hidden and using something like jQuery's clone function insert them as required into the DOM.
Alternatively if you really have to build HTML on the fly then definitely make your own class to handle it so you don't have snippets scattered through your code. You could employ jQuery literal creators to help do this.

I'm not sure if it qualifies as a "technique", but I generally tend to avoid constructing blocks of HTML in JavaScript by simply loading the relevant blocks from the back-end via AJAX and using JavaScript to swap them in and out/place them as required. (i.e.: None of the low-level text shuffling is done in JavaScript - just the DOM manipulation.)
Whilst you of course need to allow for this during the design of the back-end architecture, I can't help but think to leads to a much cleaner set up.

Sometimes I utilise a custom method to return a node structure based on provided JSON argument(s), and add that return value to the DOM as required. It ain't accessible once JS is unavailable like some backend solutions could be.

After reading some of the responses I managed to come up with my own solution using Python/Django and jQuery.
I have the HTML snippet as a Django template:
<div class="marker_info">
<p> _info_ </p>
more info...
</div>
In the view, I use the Django method render_to_string to load the templates as strings stored in a dictionary:
snippets = { 'marker_info': render_to_string('templates/marker_info_snippet.html')}
The good part about this is I can still use the template tags, for example, the url function. I use simplejson to dump it as JSON and pass it into the full template. I still wanted to dynamically replace strings in the JavaScript code, so I wrote a function to replace words surrounded by underscores with my own variables:
function render_snippet(snippet, dict) {
for (var key in dict)
{
var regex = new RegExp('_' + key + '_', 'gi');
snippet = snippet.replace(regex, dict[key]);
}
return snippet;
}

JavaScript multiline strings and templating?

I have been wondering if there is a way to define multiline strings in JavaScript like you can do in languages like PHP:
var str = "here
goes
another
line";
Apparently this breaks up the parser. I found that placing a backslash \ in front of the line feed solves the problem:
var str = "here\
goes\
another\
line";
Or I could just close and reopen the string quotes again and again.
The reason why I am asking because I am making JavaScript based UI widgets that utilize HTML templates written in JavaScript. It is painful to type HTML in strings especially if you need to open and close quotes all the time. What would be a good way to define HTML templates within JavaScript?
I am considering using separate HTML files and a compilation system to make everything easier, but the library is distributed among other developers so that HTML templates have to be easy to include for the developers.

No thats basically what you have to do to do multiline strings.
But why define the templates in javascript anwyay? why not just put them into a file and have a ajax call load them up in a variable when you need them?
For instantce (using jquery)
$.get('/path/to/template.html', function(data) {
alert(data); //will alert the template code
});

#slebetman, Thanks for the detailed example.
Quick comment on the substitute_strings function.
I had to revise
str.replace(n,substitutions[n]);
to be
str = str.replace(n,substitutions[n]);
to get it to work. (jQuery version 1.5? - it is pure javascript though.)
Also when I had below situation in my template:
$CONTENT$ repeated twice $CONTENT$ like this
I had to do additional processing to get it to work.
str = str.replace(new RegExp(n, 'g'), substitutions[n]);
And I had to refrain from $ (regex special char) as the delimiter and used # instead.
Thought I would share my findings.

There are several templating systems in javascript. However, my personal favorite is one I developed myself using ajax to fetch XML templates. The templates are XML files which makes it easy to embed HTML cleanly and it looks something like this:
<title>This is optional</title>
<body><![CDATA[
HTML content goes here, the CDATA block prevents XML errors
when using non-xhtml html.
<div id="more">
$CONTENT$ may be substituted using replace() before being
inserted into $DOCUMENT$.
</div>
]]></body>
<script><![CDATA[
/* javascript code to be evaled after template
* is inserted into document. This is to get around
* the fact that this templating system does not
* have its own turing complete programming language.
* Here's an example use:
*/
if ($HIDE_MORE$) {
document.getElementById('more').display = 'none';
}
]]></script>
And the javascript code to process the template goes something like this:
function insertTemplate (url_to_template, insertion_point, substitutions) {
// Ajax call depends on the library you're using, this is my own style:
ajax(url_to_template, function (request) {
var xml = request.responseXML;
var title = xml.getElementsByTagName('title');
if (title) {
insertion_point.innerHTML += substitute_strings(title[0],substitutions);
}
var body = xml.getElementsByTagName('body');
if (body) {
insertion_point.innerHTML += substitute_strings(body[0],substitutions);
}
var script = xml.getElementsByTagName('script');
if (script) {
eval(substitute_strings(script[0],substitutions));
}
});
}
function substitute_strings (str, substitutions) {
for (var n in substitutions) {
str.replace(n,substitutions[n]);
}
return str;
}
The way to call the template would be:
insertTemplate('http://path.to.my.template', myDiv, {
'$CONTENT$' : "The template's content",
'$DOCUMENT$' : "the document",
'$HIDE_MORE$' : 0
});
The $ sign for substituted strings is merely a convention, you may use % of # or whatever delimiters you prefer. It's just there to make the part to be substituted unambiguous.
One big advantage to using substitutions on the javascript side instead of server side processing of the template is that this allows the template to be plain static files. The advantage of that (other than not having to write server side code) is that you can then set the caching policy for the template to be very aggressive so that the browser only needs to fetch the template the first time you load it. Subsequent use of the template would come from cache and would be very fast.
Also, this is a very simple example of the implementation to illustrate the mechanism. It's not what I'm using. You can modify this further to do things like multiple substitution, better handling of script block, handle multiple content blocks by using a for loop instead of just using the first element returned, properly handling HTML entities etc.
The reason I really like this is that the HTML is simply HTML in a plain text file. This avoids quoting hell and horrible string concatenation performance issues that you'll usually find if you directly embed HTML strings in javascript.

I think I found a solution I like.
I will store templates in files and fetch them using AJAX. This works for development stage only. For production stage, the developer has to run a compiler once that compiles all templates with the source files. It also compiles JavaScript and CSS to be more compact and it compiles them to a single file.
The biggest problem now is how to educate other developers doing that. I need to build it so that it is easy to do and understand why and what are they doing.

You could also use \n to generate newlines. The html would however be on a single line and difficult to edit. But if you generate the JS using PHP or something it might be an alternative

How safe is it use document.body.innerHTML.replace?

Is running something like:
document.body.innerHTML = document.body.innerHTML.replace('old value', 'new value')
dangerous?
I'm worried that maybe some browsers might screw up the whole page, and since this is JS code that will be placed on sites out of my control, who might get visited by who knows what browsers I'm a little worried.
My goal is only to look for an occurrence of a string in the whole body and replace it.

Definitely potentially dangerous - particularly if your HTML code is complex, or if it's someone else's HTML code (i.e. its a CMS or your creating reusable javascript). Also, it will destroy any eventlisteners you have set on elements on the page.
Find the text-node with XPath, and then do a replace on it directly.
Something like this (not tested at all):
var i=0, ii, matches=xpath('//*[contains(text(),"old value")]/text()');
ii=matches.snapshotLength||matches.length;
for(;i<ii;++i){
var el=matches.snapshotItem(i)||matches[i];
el.wholeText.replace('old value','new value');
}
Where xpath() is a custom cross-browser xpath function along the lines of:
function xpath(str){
if(document.evaluate){
return document.evaluate(str,document,null,6,null);
}else{
return document.selectNodes(str);
}
}

I agree with lucideer, you should find the node containing the text you're looking for, and then do a replace. JS frameworks make this very easy. jQuery for example has the powerful :contains('your text') selector
http://api.jquery.com/contains-selector/

If you want rock solid solution, you should iterate over DOM and find value to replace that way.
However, if 'old value' is a long string that never could be mixed up with tag, attribute or attbibute value you are relatively safe by just doing replace.

Making a URL W3C valid AND work in Ajax Request

I have a generic function that returns URLs. (It's a plugin function that returns URLs to resources [images, stylesheets] within a plugin).
I use GET parameters in those URLs.
If I want to use these URLs within a HTML page, to pass W3C validation, I need to mask ampersands as &
/plugin.php?plugin=xyz&resource=stylesheet&....
but, if I want to use the URL as the "url" parameter for a AJAX call, the ampersand is not interpreted correctly, screwing up my calls.
Can I do something get & work in AJAX calls?
I would very much like to avoid adding parameters to th URL generating function (intendedUse="ajax" or whatever) or manipulating the URL in Javascript, as this plugin model will be re-used many times (and possibly by many people) and I want it as simple as possible.

It seems to me that you're running into the problem of having one piece of your application cross multiple layers. In this case it's the plugin.
A URL as specified by RFC 1738 states that a URL should use a & token to separate key/value pairs from one another. However ampersand is a reserved token in HTML and therefore should be escaped into &. Since escaping the ampersands is an artifact of HTML, your plugin should probably not be escaping them directly. Instead you should have a function or something that escapes a canonical URL so that it can be embedded in HTML markup.

The only place that this is likely to actually happen is if you are:
Using XHTML
Serving it as text/html
Using inline <script>
This is not a happy combination, and the solution is in the spec.
Use external scripts if your script
uses < or & or ]]> or --.
The XHTML media types note includes the same advice, but also provides a workaround if you choose to ignore it.

Try returning JSON instead of just a string, that way your Javascript can read the URL value as an object, and you shouldn't have that issue. Other than that, try simply HTML decoding the string, using something like:
function escapeHTML (str)
{
var div = document.createElement('div');
var text = document.createTextNode(str);
div.appendChild(text);
return div.innerHTML;
};
Obviously you'll want to make sure you remove any reference to DOM elements you might create (which I've not done here to simplify the example).
I use this technique in the AJAX sites I create at my work and have used it many times to solve this problem.

When you have markup of the form:
<a href="?a=1&b=2">
Then the value of the href attribute is ?a=1&b=2. The & is only an escape sequence in HTML/XML and doesn't affect the value of the attribute. This is similar to:
<a href="<>">
Where the value of the attribute is <>.
If, instead, you have code of the form:
<script>
var s = "?a=1&b=2";
</script>
Then you can use a JavaScript function:
<script>
var amp = String.fromCharCode(38);
var s = "?a=1"+amp+"b=2";
</script>
This allows code that would otherwise only be valid HTML or only valid XHTML to be valid in both. (See Dorwald's comments for more info.)

We Keep Coding

JavaScript is the programming language of the Web.

What is dangerous in an HTML textarea? - javascript

Related

Using MVC resource in Jquery - function stops working

Techniques to avoid building HTML strings in JavaScript?

JavaScript multiline strings and templating?

How safe is it use document.body.innerHTML.replace?

Making a URL W3C valid AND work in Ajax Request

Categories

Resources