I currently have a simple <div contenteditable="true"> working, but, here's my problem.
Currently, the user can create a persistent XSS by inserting a <script> into the div, which I definitely do not want.
However, my current ideas to fix this are:
Allow only a and img tags
Use a textarea (not a good idea, because then users can't copy and paste images)
What do you guys suggest?
You have to keep in mind that to prevent xss, you've GOT TO DO IT ON THE SERVER SIDE. If your rich text editor (ex YUI or tinyMCE) has some javascript to prevent a script tag from being inputted, that doesn't stop me from inspecting your http post requests, looking at the variable names you're using, and then using firefox poster to send whatever string I like to your server to bypass all client side validation. If you aren't validating user input SERVER SIDE then you're doing almost nothing productive to protect from XSS.
Any client side xss protection would have to do with how you render user input; not how you receive it. So, for example, if you encoded all input so it does not render as html. This goes away from what you want to accomplish though (just anchor and img tags). Just keep in mind the more you allow to be rendered the more possible vulnerabilities you expose.
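To make the "encode all input so it does not render as html" option concrete, here is a minimal sketch in Python using only the standard library. The function name render_comment and the surrounding <p> wrapper are assumptions for illustration, not something from the question.

```python
import html


def render_comment(user_text: str) -> str:
    """Escape user_text so the browser shows it literally instead of parsing it."""
    # html.escape turns &, <, > (and quotes, with quote=True) into entities.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


if __name__ == "__main__":
    print(render_comment('<script>alert("xss")</script>'))
```

This is the safest rendering mode, but as noted it also disables the a and img tags you wanted to keep.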
That being said the bulk of your protection should come from the server side and there are a lot of XSS filters out there depending on what you're writing with (ex, asp.net or tomcat/derby/jboss) that you can look into.
I think you're on the right path by allowing ONLY a and img tags. The one thing you have to keep in mind is that you can put javascript commands into the href attribute of a tags and the src attribute of img tags, so take care to validate those attributes. But the basic idea of "allow nothing and then change the filters to only allow certain things" (AKA whitelist filtering) is better than "allow everything and then filter out what I don't want" (AKA blacklist filtering).
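A rough sketch of what whitelist filtering for just a and img could look like on the server, written in Python with the standard library only. The names (ALLOWED, WhitelistSanitizer, sanitize) and the choice to allow only absolute http/https URLs are my assumptions; for production a maintained sanitizer library is the safer bet, this only illustrates the idea.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

ALLOWED = {"a": {"href"}, "img": {"src", "alt"}}  # tag -> attributes kept
URL_ATTRS = {"href", "src"}
SAFE_SCHEMES = {"http", "https"}  # assumption: absolute http(s) URLs only
VOID_TAGS = {"img"}
_C0_AND_SPACE = "".join(chr(c) for c in range(0x21))


def is_safe_url(value: str) -> bool:
    # Browsers ignore tabs/newlines inside URLs and leading control characters.
    cleaned = value.translate({9: None, 10: None, 13: None}).strip(_C0_AND_SPACE)
    try:
        return urlsplit(cleaned).scheme.lower() in SAFE_SCHEMES
    except ValueError:
        return False


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; any text inside is still escaped below
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # drops onerror, onclick, style, ...
            if name in URL_ATTRS and not is_safe_url(value):
                continue  # drops javascript: and other schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(markup: str) -> str:
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Everything not explicitly listed is dropped, so a new attack vector has to get through the list instead of around a pattern.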
In the comments below, Brian Nickel also said this which illustrates the point:
Everything but the elements and attributes you want to keep. I know you mentioned it in your answer but that bears repeating since it is so scary. <img onerror="stealMoney()">
The other thing you're going to want to do is define an XSSFilterRequest object (or something along those lines) and, in a filter, wrap your requests so that any call to your "getUrlParameter" and "getRequestParameter" methods (or whatever yours are called) runs the request values through your xss filter. This provides a clean way to filter everything without rewriting existing code.
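The wrapper idea might look roughly like this in Python. The class and method names mirror the ones mentioned above but are hypothetical, and html.escape merely stands in for whatever xss filter you actually use.

```python
import html
from typing import Callable, Mapping


class XSSFilterRequest:
    """Hypothetical wrapper: every parameter read goes through the filter."""

    def __init__(self, params: Mapping[str, str],
                 xss_filter: Callable[[str], str] = html.escape):
        self._params = params
        self._filter = xss_filter

    def get_request_parameter(self, name: str, default: str = "") -> str:
        # Existing code keeps calling the same method but now receives
        # filtered values.
        return self._filter(self._params.get(name, default))
```

Existing handlers are then handed the wrapper instead of the raw request, so none of them has to remember to call the filter.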
EDIT: A python example of xss filtering:
Python HTML sanitizer / scrubber / filter
Python library for XSS filtering?
What about using google caja (a source-to-source translator for securing Javascript-based web content)?
Unless you have xss validation on server side you could apply html_sanitize both to data sent from the user and data received from the server that is to be displayed. In worst case scenario you'll get XSSed content in database that will never be displayed to the user.
Related
Considering issues like CSRF, XSS, SQL Injection...
Site: ASP.net, SQL Server 2012
I'm reading a somewhat old page from MS: https://msdn.microsoft.com/en-us/library/ff649310.aspx#paght000004_step4
If I have a parametrized query, and one of my fields is for holding HTML, would a simple replace on certain tags do the trick?
For example, a user can type into a WYSIWYG textarea, make certain things bold, or create bullets, etc.
I want to be able to display the results from a SELECT query, so even if I HTMLEncoded it, it'll have to be HTMLDecoded.
What about a UDF that cycles through a list of scenarios? I'm curious as to the best way to deal with the seemingly sneaky ones mentioned on that page:
Quote:
An attacker can use HTML attributes such as src, lowsrc, style, and href in conjunction with the preceding tags to inject cross-site scripting. For example, the src attribute of the tag can be a source of injection, as shown in the following examples.
<img src="javascript:alert('hello');">
<img src="java&#010;script:alert('hello');">
<img src="java&#X0A;script:alert('hello');">
An attacker can also use the <style> tag to inject a script by changing the MIME type as shown in the following.
<style TYPE="text/javascript">
alert('hello');
</style>
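The sneaky src variants work because the browser decodes character references in attribute values and then ignores tabs and line breaks while parsing the URL, so a line break hidden inside "javascript" (written as a character reference such as &#010; or as a raw newline) still ends up as the javascript scheme. A small Python sketch of that normalization, where effective_scheme is a hypothetical helper name:

```python
import html
from urllib.parse import urlsplit

_C0_AND_SPACE = "".join(chr(c) for c in range(0x21))


def effective_scheme(raw_attr_value: str) -> str:
    """Return the URL scheme roughly as a browser would see it."""
    decoded = html.unescape(raw_attr_value)  # character references -> characters
    cleaned = decoded.translate({9: None, 10: None, 13: None})  # drop tab, LF, CR
    cleaned = cleaned.lstrip(_C0_AND_SPACE)  # leading control chars and spaces
    return urlsplit(cleaned).scheme.lower()
```

This is why a check has to run on the normalized value and compare against a list of allowed schemes, instead of searching the raw text for "javascript:".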
So ultimately two questions:
Best way to deal with this from within the INSERT statement itself.
Best way to deal with this from code-behind.
Best way to deal with this from within the INSERT statement itself.
None. That's not where you should do it.
Best way to deal with this from code-behind.
Use a white-list, not a black-list. HTML encode everything, then decode specific tags that are allowed.
It's reasonable to be able to specify some tags that can be used safely, but it's not reasonable to be able to catch every possible exploit.
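As a rough illustration of "encode everything, then decode specific tags", here is a Python sketch. The tag list and the function name encode_then_allow are assumptions, and it deliberately restores only attribute-free tags so that nothing like onclick can ride along.

```python
import html
import re

ALLOWED_TAGS = ("b", "i", "ul", "ol", "li")  # illustrative list, attribute-free only

# Matches an encoded opening or closing tag with no attributes, e.g. &lt;b&gt;
_ALLOWED = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE)


def encode_then_allow(user_html: str) -> str:
    encoded = html.escape(user_html, quote=True)  # step 1: encode everything
    # step 2: turn back only the exact whitelisted tags
    return _ALLOWED.sub(lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), encoded)
```

Anything the pattern does not recognise simply stays encoded and is shown as text.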
What HTML tags would be considered dangerous if stored in SQL Server?
None. SQL Server does not understand, nor try to interpret HTML tags. A HTML tag is just text.
However, HTML tags can be dangerous if output to a HTML page, because they can contain script.
If you want a user to be able to enter rich text, the following approaches should be considered:
Allow users (or the editor they are using) to generate BBCode, not HTML directly. When you output their BBCode markup, you convert any recognised tags to HTML without attributes that contain script, and any HTML to entities (& to &amp;, etc.).
Use a tried and tested HTML sanitizer to remove "unsafe" markup from your stored input, in combination with a Content Security Policy. You must do both, otherwise any gaps (and there will be gaps) in the sanitizer could allow an attack, and not all browsers fully support CSP yet (IE).
Note that both of these should be done at the point of output. Store the text "as is" in your database, and simply encode and process it into the correct format when it is output to the page.
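A minimal sketch of the BBCode route in Python, run at output time. Only [b], [i] and [url=...] are handled, and the function name and the rel="nofollow" attribute are my own choices for illustration.

```python
import html
import re

_FLAGS = re.IGNORECASE | re.DOTALL


def bbcode_to_html(text: str) -> str:
    # Any raw HTML the user typed becomes entities first.
    out = html.escape(text, quote=True)
    # Then only recognised BBCode tags are turned into fixed HTML.
    out = re.sub(r"\[b\](.*?)\[/b\]", r"<b>\1</b>", out, flags=_FLAGS)
    out = re.sub(r"\[i\](.*?)\[/i\]", r"<i>\1</i>", out, flags=_FLAGS)
    # Links must start with http:// or https://, so javascript: never matches.
    out = re.sub(r"\[url=(https?://[^\]\s]+)\](.*?)\[/url\]",
                 r'<a href="\1" rel="nofollow">\2</a>', out, flags=_FLAGS)
    return out
```

Because escaping happens before the conversion, the only HTML in the result is the HTML this function wrote itself.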
Sanitize html both on the client and on the server before you stuff any strings into SQL.
Client side:
TinyMCE - does this automatically
CKEditor - does this automatically
Server side:
Pretty easy to do this with Node, or the language/platform of your choice.
https://www.realwebsite.com
the link above shows www.realwebsite.com while it actually takes you to www.dangerouswebsite.com...
<a '
href="https://www.dangerouswebsite.com">
https://www.realwebsite.com
<'/a>
do not include the random ' in the code I put it there to bypass activating the code so you can see the code instead of just the link. (btw most websites block this or anything if you add stuff like onload="alert('TEXT')" but it can still be used to trick people into going to dangerous websites... (although its real website pops up on the bottom of your browser, some people don't check it or don't understand what it means.))
I wrote a web application that fetches email via IMAP. I now need to display these emails to the user.
I thought it would be simple (I am displaying HTML within an HTML-capable browser) until I looked into this a little... and discovered that there are tons of issues, such as:
Javascript & security
Style breaking
Surely more
Is there a good, safe way to display an HTML email? I would err for "safe" rather than "gorgeous", even though I don't want to display just the text version of an email (which is not even guaranteed to be there anyway...)
I realise the most obvious answer is "put everything in a frame" -- is that really it though? Will it actually work?
I am using Node server side if it helps...
..most obvious answer is "put everything in a frame"...will it actually work?
Yes, e.g. Whiteout Networks GmbH's WHITEOUT.IO does it in /src/tpl/read.html and /src/js/controller/read-sandbox.js. Some of the security issues are handled by DOMPurify
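One way to build such a frame is the standard sandbox and srcdoc attributes of iframe. This is my assumption of a reasonable setup, not a description of how the project above does it, and it is shown in Python only to illustrate the markup a server would emit (the helper name is hypothetical).

```python
import html


def sandboxed_email_frame(email_html: str) -> str:
    """Wrap untrusted email HTML in an iframe with every sandbox restriction on."""
    # An empty sandbox attribute disables scripts, forms and same-origin access.
    # The document is passed via srcdoc, so it must be attribute-escaped.
    srcdoc = html.escape(email_html, quote=True)
    return f'<iframe sandbox="" srcdoc="{srcdoc}" title="Message body"></iframe>'
```

Sanitizing the email before framing it still makes sense, since the sandbox does not stop remote images or misleading links.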
..there are tons of issues..Is there a good, safe way..?
I know the message data format also goes by the names EML or MHTML, so looking for a good "XY to HTML converter" or "HTML5 document viewer with XY support" may point you to usable results (e.g. GroupDocs.Viewer)
Some e-mail clients (e.g. GMail) don't use iframe, instead they use a mail parser (e.g. andris9/mailparser) and a HTML parser (e.g. cheeriojs/cheerio) to extract an e-mail-safe-html subset (see Stack Overflow: What guidelines for HTML email design are there? and Stack Overflow: Styling html email for GMail for some examples) or use a HTML sanitizer (e.g. Google's Caja, cure53/DOMPurify) and embed the code directly into the page.
But it is not always an easy thing: there is no consensus on what constitutes the e-mail-safe-html subset, and you certainly don't want to inline possibly infected attachments or run anonymous CORS scripts within the secured user's session.
Anyway, as always, studying source code of various e-mail clients (see Wikipedia: Comparison of email clients) is the way to find out..
This morning I woke up to a JavaScript alert on a project of mine that runs KnockoutJS, jQuery, and Underscore.js. It says "I can run any JavaScript of my choice on your users' browsers". The only third-party JavaScript I am downloading is Typekit, and removing that does not make this go away. I've searched my JavaScript and vendor JavaScript and this string does not come back up matching anything.
How would you troubleshoot this and/or is this something that is known to occur?
If you have a database for your application, that would be the next place to check. I'm guessing somebody found and exploited an Injection vulnerability (either un-sanitized HTML input or SQL) and injected the script into a page via the database.
The last place would be to look at the ruby code to see if somehow a malicious user modified your source.
You are obviously taking input from a user and then outputting it back as part of the HTML without quoting or sanitizing it. There are two quick checks to do:
1) Open source of page that outputs this alert and search inside source for exact text of alert - this should give you clear indication of what user-filled field is compromised.
2) To be sure, search all other fields in your database generated by users (login names, text of comments, etc.) for words "script" and "alert".
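The second check can be scripted. Below is a sketch using Python's sqlite3 module purely as a stand-in for whatever database you run; the table and column names in the usage are made up.

```python
import sqlite3


def find_suspicious_rows(conn: sqlite3.Connection, table: str, column: str):
    """Return (rowid, value) pairs whose text contains 'script' or 'alert'."""
    # table/column are chosen by the developer here, never taken from user input.
    query = (f"SELECT rowid, {column} FROM {table} "
             f"WHERE lower({column}) LIKE ? OR lower({column}) LIKE ?")
    return conn.execute(query, ("%script%", "%alert%")).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (body TEXT)")
    conn.execute("INSERT INTO comments (body) VALUES (?)",
                 ("<SCRIPT>alert('x')</SCRIPT>",))
    print(find_suspicious_rows(conn, "comments", "body"))
```

Expect some false positives (a comment that merely talks about scripts); the point is to narrow down which field is compromised.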
For the future: always sanitize your input (remove HTML tags) before inserting it into an HTML page, OR escape symbols as entities according to standards, OR explicitly treat it as plain text by assigning it to the value of a text node in the DOM.
It sounds like a hack attempt on your site. Check any databases, text files, etc. that receive user input. I'm guessing you're not checking what's being posted to your server.
I'm xss-proofing my web site for javascript and xss attacks. It's written in ASP.NET Webforms.
The main part I'd like to test is a user control that has a textbox (tinyMCE attached to it).
Users can submit stories to the site by writing in this textbox. I had to set validateRequest to false since I want to get users' stories in HTML (tinyMCE).
How should I prevent javascript-xss attacks? Since users' stories are HTML texts, I cannot use Server.HtmlEncode on their stories. In general, what's the safe way to receive HTML content from a user, save it, and then display it to users?
If one user puts malicious code in the textbox and submits it, is there a chance that this could harm other people who view that text?
Thanks.
If you don't clean what the user puts in the textbox and submits, then yes, there is a chance for harm to be done.
You might want to check out the Microsoft Anti-Cross Site Scripting Library, as it is designed to help developers prevent just such attacks.
Also worth taking a look at is OWASP's Cross-site Scripting (XSS)
You might want to look into HttpUtility.HtmlEncode and HttpUtility.HtmlDecode as well. I just wrote a quick test, and it looks like it might address your concern in the comment below (about how to display the data to other users in the right format):
string htmlString = "<b>This is a test string</b><script>alert(\"alert!\")</script> and some other text with markup <ol><li>1234235</li></ol>";
string encodedString = HttpUtility.HtmlEncode(htmlString);
// result = &lt;b&gt;This is a test string&lt;/b&gt;&lt;script&gt;alert(&quot;alert!&quot;)&lt;/script&gt; and some other text with markup &lt;ol&gt;&lt;li&gt;1234235&lt;/li&gt;&lt;/ol&gt;
string decodedString = HttpUtility.HtmlDecode(encodedString);
// result = <b>This is a test string</b><script>alert("alert!")</script> and some other text with markup <ol><li>1234235</li></ol>
ASP.NET Controls and HTMLEncode
I was going to post the information I had from my class, but I found a link that lists the exact same thing (for 1.1 and 2.0), so I'll post the link for easier reference. You can probably get more information on a specific control not listed (or 3.0/3.5/4.0 versions if they've changed) by looking on MSDN, but this should serve as a quick start guide for you, at least. Let me know if you need more information and I'll see what I can find.
ASP.NET Controls Default HTML Encoding
Here's a more comprehensive list from one of the MSDN blogs: Which ASP.NET Controls Automatically Encodes?
I would go with storing it encoded in the database, then when showing it, decode it and replace only the < with &lt; if you say you need to preserve other things.
As far as I know, if you replace every <, XSS in the page body is not really possible, because no tag can be opened: JS code has to be inside <script> tags (or inside an attribute of some other tag) to be executed, and by replacing, you'll get this in the HTML source:
&lt;script&gt; and the user will see <script> on the screen as the browser will parse the &lt; entity.
That said, if you allow users to post "raw" HTML, e.g. <b>this section is bolded</b>, then you'll have to create a "white list" of allowed tags and then manually replace the &lt; with the proper HTML, for example:
string[] allowedTags = new string[] { "a", "b", "img" };
// Turn the encoded "&lt;" back into "<" only for whitelisted tag names.
// Note: this simple prefix match does not validate the attributes that follow.
foreach (string allowedTag in allowedTags)
    output = output.Replace("&lt;" + allowedTag, "<" + allowedTag);
Have you seen the OWASP guide on this
The best way would be to have a white list of allowed tags instead of trying to come up with a way to prevent all script tags.
One solution on how to do this is here How do I filter all HTML tags except a certain whitelist?
But you also need to be aware people might have a link to external script via an image tag with a URL to their own server. See examples here http://ha.ckers.org/xss.html of the different types of attacks you need to defend against
I'll be inserting content from remote sources into a web app. The sources should be limited/trusted, but there are still a couple of problems:
The remote sources could
1) be hacked and inject bad things
2) overwrite objects in my global namespace
3) I might eventually open it up for users to enter their own remote source. (It would be up to the user to not get in trouble, but I could still reduce the risk.)
So I want to neutralize any/all injected content just to be safe.
Here's my plan so far:
1) find and remove all inline event handlers
str.replace(/(<[^>]+\bon\w+\s*=\s*["']?)/gi,"$1return;"); // untested
Ex.
<a onclick="doSomethingBad()" ...
would become
<a onclick="return;doSomethingBad()" ...
2) remove all occurrences of these tags:
script, embed, object, form, iframe, or applet
3) find all occurrences of the word script within a tag
and replace the word script with html entities for it
str.replace(/(<[^>]+)(script)/gi,toHTMLEntitiesFunc); // untested
would take care of
<a href="javascript: ..."
4) lastly any src or href attribute that doesn't start with http, should have the domain name of the remote source prepended to it
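For step 4, resolving against the remote source's base URL is usually more robust than prepending a domain by hand, because it also handles "../", "//host/..." and absolute paths. A sketch in Python (the helper name and the rule of rejecting everything that is not http/https afterwards are assumptions):

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def absolutize(url_value: str, source_base: str) -> Optional[str]:
    """Resolve url_value against the remote source; None if not http(s)."""
    absolute = urljoin(source_base, url_value.strip())
    if urlsplit(absolute).scheme in ("http", "https"):
        return absolute
    return None  # e.g. javascript:, data:, vbscript:


if __name__ == "__main__":
    base = "https://remote.example/feed/"
    print(absolutize("img/a.png", base))
    print(absolutize("javascript:alert(1)", base))
```

Checking the scheme after resolution also covers step 3's javascript: case for src and href without relying on a text search for the word "script".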
My question: Am I missing anything else? Other things that I should definitely do or not do?
Edit: I have a feeling that responses are going to fall into a couple camps.
1) The "Don't do it!" response
Okay, if someone wants to be 100% safe, they need to disconnect the computer.
It's a balance between usability and safety.
There's nothing to stop a user from just going to a site directly and being exposed. If I open it up, it will be a user entering content at their own risk. They could just as easily enter a given URL into their address bar as in my form. So unless there's a particular risk to my server, I'm okay with those risks.
2) The "I'm aware of common exploits and you need to account for this ..." response ... or You can prevent another kind of attack by doing this ... or What about this attack ...?
I'm looking for the second type unless someone can provide specific reasons why my approach would be more dangerous than what the user can do on their own.
Instead of sanitizing (black listing), I'd suggest you set up a white list and ONLY allow those very specific things.
The reason for this is you will never, never, never catch all variations of malicious script. There's just too many of them.
don't forget to also include <frame> and <frameset> along with <iframe>
for the sanitization thing , are you looking for this?
if not, perhaps you could learn a few tips from this code snippet.
But, it must go without saying that prevention is better than cure. You had better allow only trusted sources, than allow all and then sanitize.
On a related note, you may want to take a look at this article, and its slashdot discussion.
It sounds like you want to do the following:
Insert snippets of static HTML into your web page
These snippets are requested via AJAX from a remote site.
You want to sanitise the HTML before injecting into the site, as this could lead to security problems like XSS.
If this is the case, then there are no easy ways to strip out 'bad' content in JavaScript. A whitelist solution is the best, but this can get very complex. I would suggest proxying requests for the remote content through your own server and sanitizing the HTML server side. There are various libraries that can do this. I would recommend either AntiSamy or HTMLPurifier.
For a completely browser-based way of doing this, you can use IE8's toStaticHTML method. However no other browser currently implements this.