I have a usecase where the contents of the mustache HTML template could potentially come from the application/end-user (i.e. The content of the script tag in the below code snippet.)
<script id="template" type="x-tmpl-mustache">
Hello {{ name }}!
</script>
As this could potentially lead to execution of malicious code, I'm doing
Allowing only a subset of HTML tags and attributes to be added in
the template (inside the script tag)
Allowing only HTML escaped
variables i.e. only {{name}} is allowed and not {{{name}}}.
Is there anything further that needs to be considered for security of the application?
I think it's not a "moustache" problem if we follow the philosophy "small, sharp tools". Then before mapping an unsecure data (third party JSON) to template you are to validate the data with other tools.
The simplest approach to start with is to replace string fields, contaning unsecure data.
function clearJson(userStringData){
return JSON.parse(userStringData, function(k,v) {
// string values containg something like
// html tags or js block braces will not pass
return String(v).match('[<>{}]') ? 'UNSAFE' : v;
});
}
The field of code injection is too wide to have a short answer on your question. You may apply any approach which is advanced enough for your application: define data formats the application expects from user and then in runtime remove incoming suspicious data not matching these formats.
You should perform user inputs on the server, not "only" on the client. If some "bad code" is going to be executed on the client, it's already too late ;)
Related
In a Freemarker template on a page with Angular, I have the following:
...
ng-init="somevariable = ${(model.usercontrolledstring)}"
...
I want to make sure this is hardened against XSS, so I've set up some escaping rules. However, the following value for model.usercontrolledstring causes JavaScript to execute:
abc'+constructor.constructor('alert(1)')()+'abc
The surprising thing is that when the client receives it, it arrives thusly:
ng-init="somevariable = 'abc'+constructor.constructor('alert(1)')()+'abc'"
So it looks like it's being escaped correctly, but Angular is still deciding to run it!
So I guess my questions would be:
What am I not understanding about Angular? (In particular, its decision to run after decoding html entities)
Is there a proper way of configuring a Freemarker Template to prevent this sort of XSS?
I believe you should use somevariable = '${model.usercontrolledstring?jsString}' there.
Also, if that thing goes into a <script> block, certainly you shouldn't apply HTML escaping there. It's not decoded by the browser inside <script>, so you end up with string values that literally contain '. Unless the string meant to contain HTML as opposed to plain text, that's wrong.
I'm doing a forum like web app. Users are allowed to submit rich html text to server such as p tag, div tag, etc. In order to keep the format, server will write these tags back to the users' browser directly(without html encoded). So, I must do a potential dangerous script check to avoid XSS. Any JavaScript code is supposed to be dangerous and not allowed. So, How to detect them or any other better solution?
dangerous example 1:
<script>alert('1')</script>
dangerous example 2:
<script src="..."></script>
dangerous example 3:
click me
Use an HTML Parser
Your requirements are straightforward:
You must disallow all <script> tags, but keep certain rich HTML tags.
You must be able to escape inline Javascript in links. i.e. stringify it or strip the unsafe attributes altogether.
The correct way to handle all of these is to employ a modern standards-compliant HTML parser that is able to syntactically analyse the structure of the rich HTML sent over, identifying the tags sent over and discovering the raw values in attributes. This is, in fact, how sanitisation, as one of the comments mentions, is done.
There are a number of pre-existing HTML parsers that are designed to target XSS-unsafe input. The npm library js-xss, for example, appears to be able to do exactly what you want:
Whitelisting only specific tags
Modify unsafe attributes to return a default value
You can even run this server-side as a command line utility.
Similar libraries already exist for most languages, and you should do a thorough search of your preferred language's package repository. Alternatively, you can launch a subprocess and collect your results directly from js-xss from the command line.
Avoid using regular expressions to parse HTML naively - while it is true most HTML parsers end up using regular expressions under the hood, they do so in a fairly limited fashion for strictly well-defined grammars after correctly lexing them.
Use this regex
<script([^'"]|"(\\.|[^"\\])*"|'(\\.|[^'\\])*')*?<\/script>
for detecting all types of <script> tag
but I suggest using a iframe in sandbox mode to show ALL html code, by doing that you prevent javascript code from being able to do anything bad.
http://www.w3schools.com/tags/att_iframe_sandbox.asp
I hope this helps!
Considering issues like CSRF, XSS, SQL Injection...
Site: ASP.net, SQL Server 2012
I'm reading a somewhat old page from MS: https://msdn.microsoft.com/en-us/library/ff649310.aspx#paght000004_step4
If I have a parametrized query, and one of my fields is for holding HTML, would a simple replace on certain tags do the trick?
For example, a user can type into a WYSIWYG textarea, make certain things bold, or create bullets, etc.
I want to be able to display the results from a SELECT query, so even if I HTMLEncoded it, it'll have to be HTMLDecoded.
What about a UDF that cycles through a list of scenarios? I'm curious as to the best way to deal with the seemingly sneaky ones mentioned on that page:
Quote:
An attacker can use HTML attributes such as src, lowsrc, style, and href in conjunction with the preceding tags to inject cross-site scripting. For example, the src attribute of the tag can be a source of injection, as shown in the following examples.
<img src="javascript:alert('hello');">
<img src="java
script:alert('hello');">
<img src="java
script:alert('hello');">
An attacker can also use the <style> tag to inject a script by changing the MIME type as shown in the following.
<style TYPE="text/javascript">
alert('hello');
</style>
So ultimately two questions:
Best way to deal with this from within the INSERT statement itself.
Best way to deal with this from code-behind.
Best way to deal with this from within the INSERT statement itself.
None. That's not where you should do it.
Best way to deal with this from code-behind.
Use a white-list, not a black-list. HTML encode everything, then decode specific tags that are allowed.
It's reasonable to be able to specify some tags that can be used safely, but it's not reasonable to be able to catch every possible exploit.
What HTML tags would be considered dangerous if stored in SQL Server?
None. SQL Server does not understand, nor try to interpret HTML tags. A HTML tag is just text.
However, HTML tags can be dangerous if output to a HTML page, because they can contain script.
If you want a user to be able to enter rich text, the following approaches should be considered:
Allow users (or the editor they are using) to generate BBCode, not HTML directly. When you output their BBCode markup, you convert any recognised tags to HTML without attributes that contain script, and any HTML to entities (& to &, etc).
Use a tried and tested HTML sanitizer to remove "unsafe" markup from your stored input in combination with a Content Security Policy. You must do both otherwise any gaps (and there will be gaps) in the sanitizer could allow an attack, and not all browsers full support CSP yet (IE).
Note that these should be both be done on point of output. Store the text "as is" in your database, simply encode and process for the correct format when output to the page.
Sanitize html both on the client and on the server before you stuff any strings into SQL.
Client side:
TinyMCE - does this automatically
CKEditor - does this automatically
Server side:
Pretty easy to do this with Node, or the language/platform of your choice.
https://www.realwebsite.com
the link above shows www.realwebsite.com while it actually takes you to www.dangerouswebsite.com...
<a '
href="https://www.dangerouswebsite.com">
https://www.realwebsite.com
<'/a>
do not include the random ' in the code I put it there to bypass activating the code so you can see the code instead of just the link. (btw most websites block this or anything if you add stuff like onload="alert('TEXT')" but it can still be used to trick people into going to dangerous websites... (although its real website pops up on the bottom of your browser, some people don't check it or don't understand what it means.))
I'm creating an app that retrieves the text within a tweet, store it in the database and then display it on the browser.
The problem is that I'm thinking if the text has PHP tags or HTML tags it might be a security breach there.
I looked into strip_tags() but saw some bad reviews. I also saw suggestions to HTML Purifier but it was last updated years ago.
So my question is how can I be 100% secure that if the tweet text is "<script> something_bad() </script>" it won't matter?
To state the obvious the tweets are sent to the database from users so I don't want to check all individually before displaying them.
You are NEVER 100% secure, however you should take a look at this. If you use ENT_QUOTES parameter too, currently there are no ways to inject ANY XSS on your website if you're using valid charset (and your users don't use outdated browsers). However, if you want to allow people to only post SOME html tags into their "Tweet" (for example <b> for bold text), you will need to take a deep look at EACH whitelisted tag.
You've passed the first stage which is to recognise that there is a potential issue and skipped straight to trying to find a solution, without stopping to think about how you want to deal the scenario of the content. This is a critical pre-cusrsor to solving the problem.
The general rule is that you validate input and escape output
validate input
- decide whether to accept or reject it it in its entirety)
if (htmlentities($input) != $input) {
die "yuck! that tastes bad";
}
escape output
- transform the data appropriately according to where its going.
If you simply....
print "<script> something_bad() </script>";
That would be bad, but....
print JSONencode(htmlentities("<script> something_bad() </script>"));
...then you'd would have done something very strange at the front end to make the client susceptivble to a stored XSS attack.
If you're outputting to HTML (and I recommend you always do), simply HTML encode on output to the page.
As client script code is only dangerous when interpreted by the browser, it only needs to be encoded on output. After all, to the database <script> is just text. To the browser <script> tells the browser to interpret the following text as executable code, which is why you should encode it to <script>.
The OWASP XSS Prevention Cheat Sheet shows how you should do this properly depending on output context. Things get complicated when outputting to JavaScript (you may need to hex encode and HTML encode in the right order), so it is often much easier to always output to a HTML tag and then read that tag using JavaScript in the DOM rather than inserting dynamic data in scripts directly.
At the very minimum you should be encoding the < & characters and specifying the charset in metatag/HTTP header to avoid UTF7 XSS.
You need to convert the HTML characters <, > (mainly) into their HTML equivalents <, >.
This will make a < and > be displayed in the browser, but not executed - ie: if you look at the source an example may be <script>alert('xss')</script>.
Before you input your data into your database - or on output - use htmlentities().
Further reading: https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
I'm developing a facebook app right now all by my lonesome. I'm attempting to make a javascript call on an onclick event. In this onclick event, I'm populating some arguments (from the server side in php) based on that item that is being linked. I'm inserting a little bit of JSON and some other stuff with funky characters.
Facebook expects all the attribute fields of an anchor to be strictly alphanumeric. No quotes, exclamation marks, anything other than 0-9a-Z_. So it barfs on the arguments I want to pass to my javascript function (such as JSON) when the user clicks that link.
So I thought, why don't I use my templating system to just autogenerate the javascript? For each link I want to generate, I generate a unique javascript function (DoItX where X is a unique integer for this page). Then instead of trying to pass arguments to my javascript function via onclick, I will insert my arguments as local variables for DoX. On link "X" I just say onclick="DoX()".
So I did this and viola it works! (it also helps me avoid the quote escaping hell I was in earlier). But I feel icky.
My question is, am I nuts? Is there an easier way to do this? I understand the implications that somehow somebody was able to change my templated local variable, ie:
var local = {TEMPLATED FIELD};
into something with a semicolon, inserting arbitrary javascript to the client. (and I'm trying to write code to be paranoid of this).
When is it ok (is it ever ok) to generate javascript from the server? Anything I should look out for/best practices?
Depending on your application generating JavaScript in your templating language can save a lot of time but there are pitfalls to watch out for. The most serious one being that it gets really hard to test your JavaScript when you don't have your full templating stack available.
One other major pitfall is that it becomes tempting to try and 'abstract' JavaScript logic to some higher level classes. Usually this is a sign that you will be shaving yaks in your project. Keep JavaScript login in JavaScript.
Judging from the little bit of information you have given it your solution seems sensible.
If you must generate javascript, I would suggest only generating JSON and having all functions be static.
It more cleanly separates the data, and it also makes it easier to validate to prevent XSS and the like.
JS generated from server is used in lots of areas. The following is the sample from a ASP.NET page where the JS script is generated by the framework:
<script src="/WebResource.axd?d=9h5pvXGekfRWNS1g8hPVOQ2&t=633794516691875000" type="text/javascript"></script>
Try to have reusable script functions that don't require regeneration; and 'squeeze' out the really dynamic ones for server-side generation.
If you want to feel better about it, make sure that most of your JavaScript is in separate library files that don't get generated, and then, when you generate code, generate calls to those libraries rather than generating extensive amounts of JavaScript code.
it's fine to generate JS from the server. just bear in mind not to fine too big a page from the server.
Generally speaking I avoid ever automatically generating JavaScript from a server-side language, though I do however; create JavaScript variables that are initialized from server-side variables that my JavaScript will use. This makes testing and debugging much simpler.
In your case I may create local variables like the following which is easy to test:
<script type='text/javascript' language='javascript'>
<!--
var FUNC_ARG_X = <%= keyX %>;
var FUNC_ARG_Y = <%= keyY %>;
var FUNC_ARG_Z = <%= keyZ %>;
//-->
</script>
<script type='text/javascript' language='javascript'>
<!--
function DoCleanCall(arg) {
// Whatever logic here.
}
//-->
</script>
now in your markup use:
<a href='#' onclick='DoCleanCall(FUNC_ARG_X);'>Test</a>
Now of course you could have embedded the server-side variable on the <a/> tag, however it is sometimes required that you refer to these values from other parts of your JavaScript.
Notice also how the generated content is in it's own <script> tag, this is deliberate as it prevents parsers from failing and telling you that you have invalid code for every reference you use it in (as does ASP.NET), it will still fail on that section only however.